Methods and systems for low-power and pin-efficient communications with superposition signaling codes

ABSTRACT

A communication system uses a bus to transmit information by receiving a first set of signals, mapping them to a second set of signals representing codewords of a superposition signaling code, and transmitting the second set of signals. The superposition signaling code can comprise more than one layer. The pin-efficiency can be larger than 1. The system may encode bits into a codeword of a superposition signaling code that is defined by two basis vectors of predetermined size, using two encoders for the permutation modulation codes defined by those basis vectors. The bits of information are divided into a first part and a second part, each representing a predetermined number of bits, with the parts provided to the respective encoding circuits and their outputs combined by a superposition.

CROSS-REFERENCES TO RELATED APPLICATIONS

The following references are herein incorporated by reference in their entirety for all purposes:

-   U.S. patent application Ser. No. 12/784,414, filed May 20, 2010, naming Harm Cronie and Amin Shokrollahi, entitled “Orthogonal Differential Vector Signaling” (hereinafter “Cronie I”).
-   U.S. patent application Ser. No. 12/982,777, filed Dec. 30, 2010, naming Harm Cronie and Amin Shokrollahi, entitled “Power and Pin Efficient Chip-to-Chip Communications with Common-Mode Resilience and SSO Resilience” (hereinafter “Cronie II”).
-   U.S. patent application Ser. No. 13/030,027, filed Feb. 17, 2011, naming Harm Cronie, Amin Shokrollahi and Armin Tajalli, entitled “Methods and Systems for Noise Resilient, Pin-Efficient and Low Power Communications with Sparse Signaling Codes” (hereinafter “Cronie III”).

REFERENCES

-   [Poulton] U.S. Pat. No. 6,556,628 B1 to John W. Poulton, Stephen G. Tell and Robert E. Palmer, entitled “Methods and Systems for Transmitting and Receiving Differential Signals over a Plurality of Conductors”.
-   [Chiarulli] U.S. Pat. No. 7,358,869 to Donald M. Chiarulli and Steven P. Levitan, entitled “Power Efficient, High Bandwidth Communication using Multi-Signal-Differential Channels”.
-   [Slepian] D. Slepian, “Permutation Modulation,” published in the Proceedings of the IEEE, Vol. 53, No. 3, March 1965, pages 228-236.
-   [Horowitz] U.S. Pat. No. 7,142,612 B2 to Mark A. Horowitz, Scott C. Best and William F. Stonecypher, entitled “Method and Apparatus for Multi-Level Signaling”.

FIELD OF THE INVENTION

The present invention relates to communications in general and in particular to transmission of signals capable of conveying information, wherein power consumption, pin-efficiency, SSO noise and common-mode noise are constraints.

BACKGROUND OF THE INVENTION

In communication systems, the goal is to transport information from one physical location to another. It is preferred that the transport of this information is reliable, is fast and consumes a minimal amount of resources. In most cases, there are trade-offs that need to be dealt with. For example, error-free communications might be possible if there were no bandwidth, latency or power-consumption constraints on a communication channel and system, but in most applications there are some constraints that require some design for robustness. Each of these constraints can be measured and designed for.

For example, bandwidth is often measured in the number of bits representing the information being conveyed per unit time. Sometimes, information is conveyed in non-binary form, but that can typically be considered as binary data anyway. For example, if information is represented by a sequence of symbols selected from a symbol alphabet of four symbols, each symbol can represent conveyance of two bits. With symbol alphabets having a size that is not a power of two, fractional bits are involved or rounding is involved, as is well-known in the field of information theory. It is also known that different bits or symbols might have different information content, and that bits or symbols with low information content carry an excess that is often considered a redundancy, which can be compressed out or used for error detection, error correction or other purposes. In some designs, bandwidth is measured by bits/second or symbols/second of non-redundant information conveyed or of encoded information conveyed (possibly having some redundancy as required for error handling or other signaling).

Latency is often measured by the time delay between a sender having information (in the form of data, a message, signals, etc.) ready to send to a receiver over a channel and the time when the receiver can use that information. For example, for a channel between two chips, if there are eight wires and eight independent bits are conveyed at a time, the latency might be just the time of flight (length of wire divided by signal speed) plus the time needed to charge and discharge transistors or other circuit elements in the path. However, where some computation is needed to encode and decode the information, the time needed for that computation is added to the latency.

Power-consumption is often expressed as a function of the number of operations (logical, computing, etc.) that are needed per unit of bits to communicate them. For example, a design where six logical operations are needed to convey eight bits of information might be considered twice as expensive, power-consumption wise, as a design where three logical operations are needed to convey eight bits of information. Often, what is important in a design is the average effort, as that more often reflects on the total power consumption, heat dissipation, etc. as compared with the minimum and maximum values. It is known that with a fast processor and proper coding, many operations per second can be performed with arbitrary complexity. For example, a conventional packet router often has a processor or a specialized chip that reads instructions from an instruction memory and performs the instructed operation. In such cases, the power-consumption measurement has to fairly include the power consumption of the processor, the instruction memory, etc., albeit amortized over the number of bits conveyed.

As is well-known, a processor and memory can perform arbitrarily complex operations (e.g., table lookups, Fast Fourier transformations, multiplications, arbitrary finite algebra, etc.) that cannot be performed by two or three logic gates, but some designs are so power-consumption and latency limited that the design only allows for a few logic gates. For example, if the communication channel is between two parts of a chip or between two chips, it might not be practical to insert a processor and instruction memory between those two parts, consuming chip or board real estate as well as power, in order to effect some particular communications scheme. As an example, network communications between two microwave relay stations might be made reliable by using transmission powers of multiple watts, multiple levels of convolutional coding, handshaking for acknowledgements and retransmission of lost packets, and the like, where the power needed to perform the computations is largely irrelevant relative to the power needed to send the signal over the medium. None of that would be practical for communications between two chips on a circuit board or between two circuits on a chip, where the power needed to perform any needed computational, circuit or logical operations would be much larger than what is needed to actually convey the signals.

Another consideration in a design is signal noise. To send more robust signals, the amplitude of the signals can be increased, but that might lead to increased noise in other parts of the channel or system, in addition to increased power consumption. Thus, all other things being equal, the system that uses lower amplitude signals might be preferred.

In many electronic devices, communication plays an important role regardless of the function these electronic devices fulfill. Most modern electronic devices contain integrated circuits (“IC”) that exchange information with one another. The information may also be exchanged between ICs connecting two different devices. In general, one refers to these two communication settings as chip-to-chip communication.

In chip-to-chip communication, the physical transfer of information takes place over a transmission medium that may use multiple transmission paths. Each of these transmission paths is able to carry a signal measurable by some physical characteristic or quantity, such as an electric voltage signal, electric current signal, light intensity signal or other electromagnetic field strength measurable on a wire, fiber or other medium. Herein, for readability, each such path/medium for transmission is referred to as a “wire” without intending to limit wires to specific examples, and the physically measurable characteristic, quantity or phenomenon is referred to as a “signal” on that wire. While distinct wires may carry distinct signals, it is often the case that signals on one wire will induce undesirable signals on another wire, either by signal-to-signal noise, common mode noise or other known phenomena.

Depending on the exact application, a wire may be a micro-strip on a printed circuit board (“PCB”), a metal wire integrated on an IC and connecting two components within the same chip and/or a bond wire connecting two chips that are mounted on top of each other in a package-on-package configuration. As used herein, “wire” can refer to any physical path from one IC to another that can carry a signal. This physical path may comprise several parts, such as a metal trace on a PCB, a bond wire, a coupling capacitor, a “through silicon via” (“TSV”) and a connector. It is to be understood that these parts may be included in the concept of a wire. Multiple wires may be used to communicate and multiple wires in parallel constitute a communications bus. Important parameters in chip-to-chip communications are the communication speed, the power consumption, the physical footprint of the communication bus and electronics, and the error performance. In most chip-to-chip communication systems, the error rate has to be very low (e.g., less than one error in 10¹² bits) and a certain amount of energy is required to achieve that error performance.

The pin-efficiency, r, of a chip-to-chip communication system is defined as the number of bits transmitted per wire in each communication interval. The communication interval, T, is often small and may be in the order of, e.g., 10⁻¹⁰ seconds or less. Multiple wires may be used to achieve a required aggregate data rate. A high pin-efficiency is preferred over a low pin-efficiency since the former allows one to have a smaller physical footprint for the same total data rate. Furthermore, if one is able to increase the pin-efficiency, the frequency of communication may be lowered to achieve the target aggregate data rate which immediately leads to lower power consumption in most chip-to-chip communication systems.
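The definition of pin-efficiency above can be illustrated with a short sketch. The function names, the 10-wire bus and the 10 Gb/s target are assumptions chosen for illustration only; the 8-bits-on-5-wires figure borrows the bit and wire counts of the 8b5w code named in the drawings, without modeling that code.

```python
from fractions import Fraction


def pin_efficiency(bits_per_interval: int, wires: int) -> Fraction:
    """Bits conveyed per wire in each communication interval (r = k / n)."""
    if wires <= 0:
        raise ValueError("a bus needs at least one wire")
    return Fraction(bits_per_interval, wires)


def intervals_per_second(aggregate_bits_per_second: float, wires: int, r: Fraction) -> float:
    """Signaling rate 1/T needed for `wires` wires at pin-efficiency r to reach the target rate."""
    return aggregate_bits_per_second / (wires * float(r))


# Binary differential signaling: 1 bit on 2 wires per interval.
r_diff = pin_efficiency(1, 2)
# Assumed example: 8 bits on 5 wires per interval.
r_8b5w = pin_efficiency(8, 5)

# Assumed target: 10 Gb/s aggregate over a 10-wire bus.
rate_diff = intervals_per_second(10e9, 10, r_diff)
rate_8b5w = intervals_per_second(10e9, 10, r_8b5w)
```

Under these assumed numbers, the differential bus needs 2×10⁹ intervals per second while the r=1.6 bus needs 6.25×10⁸, which is the sense in which a higher pin-efficiency allows a lower frequency of communication for the same aggregate data rate.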

There are several reasons why it is difficult to design high speed, low power, small footprint and low error-rate chip-to-chip communication systems. First, communication is not perfect and the signals transmitted on the wires are disturbed by noise and interference. In chip-to-chip communications some sources of noise are thermal noise, common-mode noise and interference, crosstalk, reference noise and switching noise. An important type of switching noise is simultaneous switching output (“SSO”) noise. SSO noise plays a large role at higher communication frequencies. The resilience against some of these noise types may be increased by increasing the transmit power. For others, such as SSO noise and crosstalk, increasing the transmit power is not beneficial since these noise types tend to increase as well.

A second impairment of chip-to-chip communications is that the physical medium tends to attenuate the signals transmitted on the wires. Especially when the data rate increases and/or the wires become longer, attenuation will increase. This requires one to increase transmission power burned in the drivers and use equalization methods. Since a large part of the power consumption consists of the power burned in the drivers the total power consumption has to increase to combat the attenuation.

To combat the issues with respect to common-mode noise, crosstalk, and SSO noise many chip-to-chip communication systems use differential signaling. In differential signaling an information carrying signal is encoded into the difference of two signals. Each of these encoded signals is transmitted on a separate wire. Differential signaling provides immunity against common-mode noise and interference and one can use a transmitter architecture that minimizes SSO noise. A major downside of differential signaling is that the pin-efficiency is only r=0.5. To achieve high data rates one would have to run at very high frequencies where attenuation is high. One would like to use signaling methods for chip-to-chip communications that preserve the excellent properties of differential signaling but operate at a significantly higher pin-efficiency. Furthermore, these methods should allow for an efficient implementation in terms of circuitry and lead to a power-efficient operation.

Some systems have been devised to deal with the constraints explained above, including some previously developed by the inventors named in the present application.

FIG. 1 illustrates generally a conventional chip-to-chip communication system that uses differential signaling. The system is shown comprising a transmit unit 100 connected by a communication bus 120 to a receive unit 150. Transmit unit 100 comprises a driver unit 110 that drives two wires 122 of bus 120. Driver unit 110 generates two signals 112 and 114, denoted in the figure by s₀ and s₁, based on the information to be transmitted on bus 120. Driver 110 may drive the wires of bus 120 in voltage-mode or current-mode. Bus 120 may be terminated at the receiver by a termination resistor 130 and at the transmitter by a termination resistor 132. A differential amplifier or comparator 140 measures the voltage across termination resistor 130 and detects the data transmitted on bus 120.

For differential signaling, these two signals satisfy s₀=−s₁ and this property gives differential signaling its excellent properties with respect to common-mode noise and crosstalk. Driver 110 may perform additional tasks, such as amplification, pre-emphasis and equalization. Differential amplifier 140 may perform additional tasks, such as de-emphasis and equalization. By “perform tasks”, it should be understood that performance can be implemented using particular circuitry and/or physically-implemented logical elements.
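The common-mode property of s₀=−s₁ can be sketched numerically. The following is an idealized model only, assuming noiseless wires apart from an offset added equally to both; the function names and the unit amplitude are hypothetical and do not describe the circuitry of FIG. 1.

```python
def differential_encode(bit: int, amplitude: float = 1.0) -> tuple:
    """Map one bit to a wire pair (s0, s1) satisfying s0 = -s1."""
    s0 = amplitude if bit else -amplitude
    return s0, -s0


def add_common_mode(pair: tuple, offset: float) -> tuple:
    """Model common-mode noise as the same offset appearing on both wires."""
    return pair[0] + offset, pair[1] + offset


def differential_decode(r0: float, r1: float) -> int:
    """Recover the bit from the sign of the difference of the two wire signals."""
    return 1 if (r0 - r1) > 0 else 0


# The offset cancels in r0 - r1, so every bit survives every offset tried here.
recovered = [
    (bit, differential_decode(*add_common_mode(differential_encode(bit), offset)))
    for bit in (0, 1)
    for offset in (-5.0, 0.0, 3.3)
]
```

Because the decoder only looks at the difference of the two wires, any disturbance that is identical on both wires drops out, at the cost of using two wires per bit.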

FIG. 1 also shows that transmit unit 100 is connected to a positive terminal 160 of a power supply (not shown) and a ground terminal 162. Terminals 160 and 162 supply transmit unit 100 with a voltage of Vdd volts. The circuitry of transmit unit 100 requires a power supply to operate. A parasitic inductor 164 impairs the connection to Vdd, while a parasitic inductor 166 impairs the connection to ground terminal 162. Parasitic inductors 164, 166 might be a result of, e.g., a bondwire and/or impedance discontinuity in an IC package, as is known. When circuitry in transmit unit 100 causes variations in currents through parasitic inductors 164, 166, a voltage develops across those parasitic inductors, which causes a drop in power supply voltage for circuitry in transmit unit 100 and this may cause the signals transmitted on bus 120 to be disturbed. The time-varying current through parasitic inductors 164, 166 is largely determined by the signaling method. For binary differential signaling, the variation is minimal, since s₀=−s₁.

FIG. 2 illustrates the use of differential signaling with multiple wires, on a chip-to-chip communication system. FIG. 2 shows a chip-to-chip communication system where communication takes place over a bus 220 comprising 2n wires 235 between a transmit unit 200 and a receive unit 250. Transmit unit 200 comprises n drivers 260 that each implement differential signaling of an input signal (not shown). Each of these drivers 260 is connected to a different pair of wires of communication bus 220. The wires 235 of communication bus 220 can be terminated at transmit unit 200 and/or receive unit 250. At receive unit 250, differential receivers or comparators 270 sense the signals on each pair of wires. Drivers 260 in transmit unit 200 are connected to a positive power supply 280 and a ground 282. Both connections are through parasitic inductors 284 and 286, respectively. Since differential transmitters are used, the variation of the currents through parasitic inductors 284 and 286 may be small. The reason for this is clear when binary differential signaling is used and the bus is driven in current-mode. In that instance, each of drivers 260 sources a current of some strength I into one of the wires of the wire pair and sinks a current of that strength I from the other wire of the wire pair. The sum of all currents that are sourced by drivers 260 is supplied through parasitic inductor 284 and is constant. The sum of all currents that are sunk by drivers 260 is sunk into ground 282 through parasitic inductor 286 and is constant as well. Hence the introduction of SSO is minimized. One may require multiple connections to Vdd and ground to limit the current through each of these connections. SSO is also minimized, as should be apparent.
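The constant-current argument for current-mode binary differential drivers can be checked with a small idealized model. The function name, the unit drive current and the choice of four drivers are assumptions for illustration; parasitics and transition dynamics are not modeled.

```python
from itertools import product


def supply_currents(bits, drive_current=1.0):
    """Return (sourced, sunk) current totals for current-mode binary differential drivers.

    Each driver sources `drive_current` into one wire of its pair and sinks the same
    current from the other wire; the data bit only selects which wire is which.
    """
    sourced = 0.0
    sunk = 0.0
    for bit in bits:
        if bit:
            wire_currents = (drive_current, -drive_current)
        else:
            wire_currents = (-drive_current, drive_current)
        sourced += sum(c for c in wire_currents if c > 0)
        sunk += sum(-c for c in wire_currents if c < 0)
    return sourced, sunk


# Collect the totals over every data pattern of n = 4 drivers (8 wires).
totals = {supply_currents(bits) for bits in product((0, 1), repeat=4)}
```

In this model the set `totals` contains a single entry, so the current drawn through the supply and ground connections does not depend on the data, which is the property that keeps the introduction of SSO noise small.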

A major drawback of using binary differential signaling is that the pin-efficiency, r, is only r=0.5. To achieve a bitrate of f_(b) bits per second per wire, the symbol rate or frequency of operation per wire has to be 2f_(b). For high-speed operation and/or for longer transmission paths, the amount of power spent in the drivers has to increase substantially to mitigate the effects of attenuation. To achieve a higher pin-efficiency with differential signaling, one can opt for multi-level differential signaling. Although this leads to higher pin-efficiencies, the required transmission power to assure reliable communication may increase faster than the advantages obtained from a potentially lower symbol rate.

In [Poulton], a multi-wire differential signaling scheme was proposed that indicated a potential to obtain higher pin-efficiencies than differential signaling. The scheme disclosed in [Poulton] retains several of the noise resilience properties of differential signaling, but creates other problems.

[Poulton] only teaches how to implement pin-efficient schemes for three and four wires, but often transmission takes place over a bus of more than four wires. Also, as pointed out in [Poulton], the method disclosed there is not very power-efficient. A signaling method that achieves high pin-efficiency and is power efficient is therefore preferred. Finally, encoding and decoding the signaling method as disclosed in [Poulton] is not straightforward, especially when the number of wires is more than four and therefore much power might be consumed in just the processing and handling of signaling.

[Horowitz] describes a signaling method that reduces SSO. The method in [Horowitz] is based on multi-level signaling where the sum of the signal levels transmitted on the bus is kept nearly constant from bus cycle to bus cycle. There are several problems with this approach. First, for the method to have maximum effect, all driver circuitry has to use a single connection to Vdd and a single connection to ground. This is only possible for a small number of bus wires such that the total required current can be limited. Second, encoding and decoding such signaling schemes is only possible for a small number of bus wires to avoid overly complex encoding and decoding. Third, introducing memory between consecutive bus cycles increases the delay of the bus communication system.

In Cronie I, orthogonal differential vector signaling (“ODVS”) is described. ODVS allows for chip-to-chip communications with a pin-efficiency larger than that of differential signaling (up to r=1.0) with a resilience against several types of noise similar to that of differential signaling. Where even larger pin-efficiencies are needed, the teachings of Cronie II and Cronie III can be applied.

Cronie II and Cronie III teach that spherical codes can be used to obtain pin-efficiencies larger than r=1.0. In some embodiments, these spherical codes are permutation modulation codes (as in Cronie II) or sparse signaling codes (as in Cronie III). These codes lead to pin-efficient and noise resilient chip-to-chip communications, while keeping the power consumption of transmitter and receiver low compared to conventional signaling methods. In certain cases, encoding the signaling schemes of Cronie III for pin-efficiencies larger than r=1.0 can be simplified. Furthermore, for high pin-efficiencies and some of the signaling schemes of Cronie III, hardware architectures that effectively mitigate SSO noise can also be simplified.

Yet some applications might still require more power-efficient signaling methods that provide a pin-efficiency larger than r=0.5 and provide SSO resilience with an efficient hardware architecture, with good performance with respect to common-mode noise and interference.

BRIEF SUMMARY

Embodiments of the present invention provide for processes and apparatus for transmitting data over physical channels such that the signals transmitted are resilient to common mode noise, do not require a common reference at the transmission and reception points, involve a pin-efficiency that is greater than 100%, with relatively low power dissipation for encoding and decoding. Corresponding decoders at reception points are also disclosed.

In a particular embodiment, information is transmitted over a communication bus by receiving a first set of signals representing the information, mapping the first set of signals to a second set of signals, wherein the second set of signals comprises one or more codewords selected from among the valid codewords of a superposition code, and providing the second set of signals for transmission over the communication bus. A corresponding decoder decodes the second set of signals (possibly altered by the communication bus) in an attempt to recover a replication of the first set of signals while reducing the amount of energy needed to do so.

In some embodiments, a transmit unit receives a number of bits of information to be conveyed to a receive unit and receives that number of bits in each of a plurality of time periods such that the receive unit can recover (at least approximately) that number of bits in each of the time periods (possibly with some latency). The transmit unit takes those bits, splits them into two or more input bit vectors, maps each input bit vector onto an element of a signal constellation associated with that input bit vector, and combines the resulting elements to form a plurality of signals for a plurality of wires of a bus between the transmit unit and the receive unit. In particular, the transmit unit might receive k bits in a period, split those k bits into l parts (l being greater than one) and, for each part, apply the part to a permutation modulation basis vector for that part, thus forming l permutation vectors that are combined to form a set of signals output, for that period, on the wires of the bus, where the number of wires, n, on the bus is equal to k/l when all of the l parts are of the same size. The use of a plurality of permutation vectors and their combination forms a “superposition signaling code” scheme.

An encoder (and corresponding decoder) can be defined by k, l, n, the particular permutation modulation (“PM”) basis vectors used, and possibly also the combination technique used. Thus, codewords from the superposition signaling code can be represented by vectors, with each vector comprising a plurality of vector components, wherein codewords are characterized as being combinations of some number, l (with l>1), of secondary codewords using a superpositioning operator. Each of the secondary codewords is representable by vectors belonging to some predetermined set of secondary codewords.

In some embodiments, the superpositioning operator adds the components of these l secondary codewords to obtain a superpositioning codeword that is to be sent over the communications channel. The number of secondary codewords, l, is herein referred to as the number of “layers” of the superposition code. In some embodiments, the secondary codewords may be obtained by different permutations of one or more basis vectors, giving rise to a set of secondary codewords that forms a permutation modulation code.
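The layered construction described above, in which each part of the input bits selects a permutation of a basis vector and the selected secondary codewords are added componentwise, can be sketched as follows. Everything specific here is an assumption made for illustration: the two basis vectors, the 3+3 bit split on 4 wires, and the rule that indexes permutations in sorted order. The index generators of the later figures are not reproduced.

```python
from itertools import permutations, product


def pm_codebook(basis, bits):
    """First 2**bits distinct permutations of `basis` in sorted order (hypothetical indexing)."""
    words = sorted(set(permutations(basis)))
    if len(words) < 2 ** bits:
        raise ValueError("basis vector has too few distinct permutations")
    return words[: 2 ** bits]


def superposition_encode(bits, layers):
    """Split `bits` across layers, pick one secondary codeword per layer, add componentwise.

    `layers` is a list of (basis_vector, bits_in_layer) pairs, one pair per layer.
    """
    if len(bits) != sum(k for _, k in layers):
        raise ValueError("bit count does not match the layer sizes")
    codeword = [0] * len(layers[0][0])
    pos = 0
    for basis, k in layers:
        part = bits[pos:pos + k]
        pos += k
        index = int("".join(str(b) for b in part), 2)
        secondary = pm_codebook(basis, k)[index]
        codeword = [c + s for c, s in zip(codeword, secondary)]
    return codeword


# Assumed two-layer example on n = 4 wires carrying k = 6 bits per period.
LAYERS = [((1, 0, 0, -1), 3), ((3, 0, 0, -3), 3)]
z = superposition_encode([0, 0, 0, 1, 1, 1], LAYERS)

# Every 6-bit input should map to a different codeword so that it can be recovered.
ALL_WORDS = {tuple(superposition_encode(list(b), LAYERS)) for b in product((0, 1), repeat=6)}
```

The second basis vector is scaled by three in this sketch so that the two layers can be separated again from the componentwise sum; that scaling is a design choice of the example, not a requirement stated in the text. Both basis vectors sum to zero, so every superposed codeword does as well.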

In some embodiments, the secondary codewords may have quiescent vector components and nonquiescent vector components, and the set of all possible secondary codewords can be from one sparse signaling code or from a union of several sparse signaling codes, wherein a sparse signaling code comprises vectors for which the number of quiescent vector components and nonquiescent vector components meets some sparseness requirement. One such sparseness requirement might be that a ratio of quiescent vector components to total vector components is greater than or equal to one-third. However, other sparseness requirements might be used instead. In specific examples, a quiescent vector component is represented by a value of zero, a zero voltage and/or a zero current, but the sparse code need not be limited to such examples. In general, a quiescent vector component is a vector component that does not lead to physical power transfer from one end to another end of a bus wire, or at least substantially less physical power transfer as compared with the physical power transfer caused by a nonquiescent vector component. The quiescent vector component is typically referred to herein as the “zero” symbol.
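The example sparseness requirement above, a ratio of quiescent to total vector components of at least one-third, can be expressed as a simple check. The function names are hypothetical, and the sketch assumes the quiescent component is represented by the value zero, which the text gives as one specific example.

```python
from fractions import Fraction


def quiescent_ratio(codeword, zero=0):
    """Fraction of vector components equal to the quiescent ("zero") symbol."""
    if not codeword:
        raise ValueError("codeword must have at least one component")
    return Fraction(sum(1 for c in codeword if c == zero), len(codeword))


def is_sparse(codeword, min_ratio=Fraction(1, 3)):
    """True when the quiescent ratio meets the assumed one-third requirement."""
    return quiescent_ratio(codeword) >= min_ratio


half_quiet = is_sparse((1, 0, 0, -1))      # two of four components are quiescent
none_quiet = is_sparse((1, -1, 1, -1))     # no quiescent components
third_quiet = is_sparse((1, 0, -1))        # exactly one-third quiescent
```

Exact fractions are used so that the boundary case of exactly one-third is not affected by floating-point rounding.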

In some embodiments, different voltage, current, etc. levels are used for signaling and more than two levels might be used, such as a ternary sparse signaling code wherein each wire signal has one of three values. In some embodiments, there are no more than two nonquiescent vector components for each codeword vector and in some embodiments, at least half of the vector components of each codeword vector are quiescent vector components.

Hardware elements might include storage for symbols of input information used for selecting codewords, processing hardware to convert symbols to signals, parsing symbols into separate partitions, storing results, and providing the partitions in sequence as signals.

Various embodiments of the invention are given with reference to specific hardware implementations of small area and low power dissipation. The following detailed description together with the accompanying drawings will provide a better understanding of the nature and advantages of the present invention.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates generally a conventional chip-to-chip communication system that uses differential signaling.

FIG. 2 illustrates the conventional use of differential signaling with multiple wires, on a chip-to-chip communication system.

FIG. 3 illustrates an overview of a chip-to-chip communication system that might use elements of the present invention.

FIG. 4 illustrates various examples for pulse shapes.

FIG. 5 illustrates an embodiment of the encoder of FIG. 3 in greater detail.

FIG. 6 illustrates an encoder that uses an index generator.

FIG. 7 is a flowchart of an encoding process that may be used by an index generator to define a permutation of a basis vector.

FIG. 8 is a flowchart of an encoding process that may be used by an index generator to define a permutation of a basis vector with a more general basis vector.

FIG. 9 is a flowchart of an encoding process that may be used by an index generator to define a permutation of a basis vector for an 8b8w code.

FIG. 10 is a mapping of bits to encodings, resulting from the process of FIG. 9.

FIG. 11 is a block diagram of an encoder for an 8b5w code.

FIG. 12 is a block diagram of an encoder for a 24b8w code.

FIG. 13 illustrates an example circuit that might be used for the driver in the circuit of FIG. 3.

FIG. 14 illustrates an example circuit that might be used for the driver in the circuit of FIG. 3, with controlled current sources.

FIG. 15 is a block diagram of a circuit for handling currents with both positive and negative values.

FIG. 16 is an illustration of examples for signals that are driven on a communication bus.

FIG. 17 illustrates a variation of a transmit unit wherein separate drivers appear prior to a combine unit.

FIG. 18 illustrates a transmit unit that might be used as the transmit unit in the circuit of FIG. 3.

FIG. 19 illustrates an example Signal-to-Digital Converter (“SDC”) as might be used for the SDC in the circuit of FIG. 3.

FIG. 20 illustrates an example decoder.

FIG. 21 illustrates a decoder that uses the properties of the underlying permutation modulation codes.

FIG. 22 illustrates a decoder for an 8b5w superposition signaling code.

FIG. 23 illustrates an SDC that incorporates common-mode cancellation.

FIG. 24 illustrates a transmit unit that uses superposition signaling codes to minimize introduction of SSO noise.

FIG. 25 illustrates an embodiment of a general driver that drives n bus wires.

FIG. 26 illustrates a driver in which combining is performed in analog.

FIG. 27 illustrates another embodiment of the encoder of FIG. 3 in greater detail.

DETAILED DESCRIPTION OF EMBODIMENTS

FIG. 3 illustrates an overview of a chip-to-chip communication system that might use elements of the present invention. It should be understood that the descriptions here are not meant to be limiting and one of ordinary skill in the art should understand, after reading this disclosure with the accompanying figures, that other variations are possible. As illustrated in FIG. 3, a source of information (or data) typically represented or representable by bits to be conveyed in each of a plurality of time periods, illustrated as source 400, provides k bits to a destination 480 per communication period, T, where source 400 and destination 480 might be chips or circuits that use the transmitted data in some fashion. The information flow is generally from source 400 to a transmit unit 410, to a bus 440, to a receive unit 450 and from there to destination 480. Preferably, the overall transmit rate, the overall power consumption of elements per communication period, and the error rate are all within design constraints for the particular application, while other constraints, such as a pin-efficiency constraint, are also met.

Communication bus 440 comprises n wires 445. Without loss of generality, one may assume that the information in the source 400 is available as a sequence of k bits. Furthermore, one may assume that in each communication period (time interval) of T seconds, the next sequence of k bits is available.

Transmit unit 410 may have two main tasks. First, the information may be encoded by an encoder 420 such that the original information from source 400 is protected with respect to several noise types that are present in chip-to-chip communications. Second, the transmit unit 410 may comprise a driver 430 that drives wires 445 of bus 440 based on the encoded data from encoder 420. At the other end of communication bus 440, receive unit 450 senses the signals at the output of communication bus 440 and attempts to recover the information that was transmitted by transmit unit 410. For this purpose, receive unit 450 may comprise a signal-to-digital converter (“SDC”) 460 and a decoder 470. SDC 460 senses the signals on bus 440 and may generate a discrete representation of the signals on bus 440. This representation is forwarded to decoder 470, which recovers the original information sent by source 400 and forwards the result to destination 480.

In a preferred embodiment, the transmit unit 410 generates a sequence of signals for the n wires of the bus. The signals are determined for each communication period based on the k bits to be sent, generally, as indicated by Equation 1, wherein the sequence of signals over a given communications period are represented by the set of signals s₀(t), . . . , s_(n−1)(t), where z represents an encoding of the k bits and p(t) represents a pulse shape (as a function of time).

$\begin{matrix} {{{Signal}\mspace{14mu}{Set}} = {\begin{bmatrix} {s_{0}(t)} \\ \vdots \\ {s_{n - 1}(t)} \end{bmatrix} = {{zp}(t)}}} & \left( {{Eqn}.\mspace{11mu} 1} \right) \end{matrix}$

In Equation 1, z is representable as a vector of size n and p(t) is a time-dependent scalar predetermined pulse shape. Each communications period corresponds to a time interval over which p(t) is defined and in which a new sequence of k bits may be available from source 400, from which encoder 420 may generate a new vector z based on those k bits.

FIG. 4 illustrates various examples for pulse shapes, p(t). The pulse shape might be chosen according to the application. FIG. 4(a) shows a square pulse shape 510; FIG. 4(b) shows a pulse shape 520 with a finite rise and fall time t_(r); FIG. 4(c) shows a pulse shape 530 that includes pre-emphasis; and FIG. 4(d) shows a periodic square pulse shape 540. Pulse shape 520 may show improved crosstalk performance compared to pulse shape 510. Pulse shape 530 may be used to combat high frequency losses. Other pulse shapes might be used instead, as might be known to one of ordinary skill in the art.
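By way of illustration only, the relationship of Equation 1 might be sketched in software as follows. The sketch assumes a trapezoidal pulse in the spirit of pulse shape 520; the function names, the example vector z, the period and the rise time are hypothetical values chosen for the example and are not taken from the figures.

```python
def trapezoid_pulse(t, period, rise):
    """Pulse with linear rise and fall of duration `rise` and a flat top at 1.0."""
    if t < 0.0 or t > period:
        return 0.0
    if t < rise:
        return t / rise
    if t > period - rise:
        return (period - t) / rise
    return 1.0


def signal_set(z, t, period, rise):
    """Evaluate Equation 1 at time t: s_i(t) = z_i * p(t) for every wire i."""
    p = trapezoid_pulse(t, period, rise)
    return [z_i * p for z_i in z]


# Hypothetical values, chosen only for illustration.
z = [-4, 3, 1, 0, 0]
period, rise = 200e-12, 40e-12
for t in (0.0, 20e-12, 100e-12, 190e-12):
    print(t, signal_set(z, t, period, rise))
```

Because p(t) is a scalar shared by all wires, the ratios between the wire signals are fixed by z throughout the communication period.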

The set of all vectors z is referred to herein as “the signal constellation” S of the code. The choice of S affects the performance of a chip-to-chip communication system, since S determines (at least in part) the system's resilience against noise, the pin-efficiency of the system and the power consumption of the system. The k information bits have to be mapped to the elements of S (in order that they may be uniquely recovered from the set of signals sent) and this may lead to additional complexity in terms of hardware. The choice of S has consequences for the required complexity. Ways to choose S such that one achieves a resilience against common-mode noise, a high pin-efficiency and a low power consumption while keeping the additional hardware complexity low are explained herein.

FIG. 5 illustrates an embodiment of encoder 420 of FIG. 3 in greater detail. Encoder 420 is shown comprising a plurality of permutation modulation encoders 550 that each encodes a permutation modulation code. Permutation modulation codes are a type of spherical codes. [Slepian] describes permutation modulation codes. Specific instances of spherical codes and permutation modulation codes have useful properties for chip-to-chip communications as shown in Cronie II and Cronie III. In Cronie III, it is shown, for example, that with a type of permutation modulation codes referred to therein as sparse signaling codes, the power consumption of a chip-to-chip communication system may be substantially reduced.

The input 505 to encoder 420 (each period) is a sequence of k bits from source 400 (not shown). This sequence of k bits is split into l parts, where l is at least two, but can be three, four, or more than four. The first part comprises k₀ bits and is forwarded to a first permutation modulation encoder 550(0). The second part comprises k₁ bits and is forwarded to a second permutation modulation encoder 550(1), and so on up to the l-th part comprising k_(l−1) bits forwarded to an l-th permutation modulation encoder 550(l−1). For the purpose of this explanation, it is assumed that at least one of k₀ to k_(l−1) is greater than 1. Note that k=k₀+k₁+ . . . +k_(l−1).

Superposition Signaling Encoding for l=2

The following example is for the two-layer encoder, i.e., where l=2. Extensions to more than two layers are explained later. Where l=2, k=k₀+k₁ and there are two permutation modulation encoders. The first permutation modulation encoder, 550(0), is defined by a basis vector, x₀, that is of dimension N₀. In a preferred embodiment, the vector x₀ has a form wherein there are t+1 subsequences of vector components, each subsequence having a constant real value, with the values in decreasing order, as shown by Equation 2.

$\begin{matrix} {x_{0} = \left( {\underbrace{a_{0},\ldots,a_{0}}_{n_{0}},\;\underbrace{a_{1},\ldots,a_{1}}_{n_{1}},\;\ldots,\;\underbrace{a_{t},\ldots,a_{t}}_{n_{t}}} \right)} & \left( {{Eqn}.\mspace{11mu} 2} \right) \end{matrix}$

In Equation 2, n₀, n₁, . . . , n_(t) are positive integers summing to N₀, and the vector components a₀, a₁, . . . , a_(t) are real numbers such that n₀a₀+n₁a₁+ . . . +n_(t)a_(t)=0 and a₀>a₁> . . . >a_(t). The permutation modulation code defined by x₀ is the set of all permutations of x₀. The elements of the permutation modulation code generated by x₀ can be enumerated by the different partitions of the set {1, 2, . . . , N₀} into subsets of sizes n₀, n₁, . . . , n_(t). Let S₀ denote the signal constellation defined by the set of all permutations of x₀. The size of S₀ is as shown in Equation 3. If |S₀|≥2^(k₀), then k₀ bits can be uniquely mapped to 2^(k₀) different elements of S₀.

$\begin{matrix} {{S_{0}} = {\frac{N_{0}!}{{n_{0}!}{n_{1}!}\ldots\;{n_{t}!}}.}} & \left( {{Eqn}.\mspace{11mu} 3} \right) \end{matrix}$

Similarly, permutation modulation encoder 550(1) is defined by a basis vector x₁ that is of dimension N₁. In a preferred embodiment, the vector x₁ has a similar form, as shown by Equation 4.

$\begin{matrix} \left. {x_{1} = {\left( {\underset{\underset{m_{0}}{︸}}{b_{0},\ldots\;,b_{0}}{\underset{\underset{m_{1}}{︸}}{b_{1},\ldots\;,b_{1}}}\mspace{11mu}\ldots}\;  \right.\underset{\underset{m_{u}}{︸}}{b_{u},\ldots\;,b_{u}}}} \right) & \left( {{Eqn}.\mspace{11mu} 4} \right) \end{matrix}$

In Equation 4, m₀, m₁, . . . , m_(u) are positive integers summing to N₁, and the vector components b₀, b₁, . . . , b_(u) are real numbers such that m₀b₀+m₁b₁+ . . . +m_(u)b_(u)=0 and b₀>b₁> . . . >b_(u). The permutation modulation code defined by x₁ is the set of all permutations of x₁. The elements of the permutation modulation code generated by x₁ can be enumerated by the different partitions of the set {1, 2, . . . , N₁} into subsets of sizes m₀, m₁, . . . , m_(u). Let S₁ denote the signal constellation defined by the set of all permutations of x₁. The size of S₁ is as shown in Equation 5. If |S₁|≥2^(k₁), then k₁ bits can be uniquely mapped to 2^(k₁) different elements of S₁.

$\begin{matrix} {{S_{0}} = \frac{N_{0}!}{{m_{0}!}{m_{1}!}\ldots\;{m_{u}!}}} & \left( {{Eqn}.\mspace{11mu} 5} \right) \end{matrix}$

In a preferred embodiment, the sizes of x₀ and x₁ are equal, so N₀=N₁. However, in some cases, it may be preferred to use basis vectors that define permutation modulation codes that are not equal in size. The length, n, of a superposition signaling code is defined as the maximum length of any of the basis vectors in the collection of basis vectors used. As an example, consider the following basis vector x=[1 0 0 0 −1] of dimension 5 that may be used by permutation modulation (“PM”) encoders 550(0) and 550(1). There are 20 distinct permutations of x, which allows for any sequence of four bits to be encoded into a unique permutation of x.

In the encoder, the k bits are split by a splitter 510 into the k₀, . . . , k_(l−1) parts that are provided to the l PM encoders 550. Where l=2, splitter 510 provides two parts to PM encoders 550(0) and 550(1) and the permutations of x₀ and x₁ as generated by permutation modulation (“PM”) encoders 550(0) and 550(1) are denoted by P₀(x₀) and P₁(x₁), respectively. The vectors P₀(x₀) and P₁(x₁) are provided to a combine unit 530. In some embodiments, splitter 510 and combine unit 530 comprise actual circuitry, whereas in other embodiments one or both are effected by particular wire connections and/or routing.

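In a preferred embodiment, combine unit 530 generates an encoded vector, z, as a superposition, as indicated in Equation 6.

$\begin{matrix} {z = {{P_{0}\left( x_{0} \right)} + {P_{1}\left( x_{1} \right)}}} & \left( {{Eqn}.\mspace{11mu} 6} \right) \end{matrix}$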

The set S of all z that are generated by the superposition of the permutations of the basis vectors x₀ and x₁ defines a signal constellation. In preferred embodiments, this signal constellation is used to transmit information according to Equation 1. In a preferred embodiment, the basis vectors x₀ and x₁ are chosen in such a way that the map from the pair of permutations P₀(x₀) and P₁(x₁) to an element z of S is one-to-one.

Superposition Signaling Encoding for l>2

The above scheme may be extended to using more than two basis vectors to generate encoded vectors z. In a preferred embodiment, a transmit unit uses three or even more basis vectors. A preferred embodiment where more than two basis vectors are used is exemplified in FIG. 27. Encoder 420 comprises l PM encoders 550. The i-th PM encoder 550(i) is defined by a basis vector x_(i). PM encoder 550(i) takes as its input a set of k_(i) bits and maps these into a permutation of the basis vector x_(i). Let the permutation of the i-th basis vector x_(i) be denoted by P_(i)(x_(i)). These permuted vectors are input to combine unit 530, which outputs signals 528 having n components for n wires (not shown). In certain embodiments, the lengths of the vectors x_(i) may be the same, and combine unit 530 may generate an encoded vector z of the form shown in Equation 7.

$\begin{matrix} {z = {\sum\limits_{i = 0}^{l - 1}{{P_{i}\left( x_{i} \right)}.}}} & \left( {{Eqn}.\mspace{11mu} 7} \right) \end{matrix}$

For the purpose of this disclosure, it is assumed that for at least one of the basis vectors x_(i) more than two permutations of x_(i) are used. The signal constellation S is defined by the set of all possible z and the code it defines is referred to herein as a “superposition signaling code” while the number l is referred to as the number of layers of the superposition signaling code.

First Examples of Superposition Signaling Codes

A first example of a superposition signaling code for transmission on five wires is furnished by choosing the basis vectors x₀=[1 0 0 0 −1] and x₁=[3 0 0 0 −3]. Here, l=2, i.e., the superposition signaling code comprises two layers. The basis vectors x₀ and x₁ are both a scaled version of the vector [1 0 0 0 −1]. There are 20 distinct permutations of this vector. When z is generated according to Equation 6, the mapping from P₀(x₀) and P₁(x₁) to z is one-to-one. The resulting signal constellation S comprises 400 elements. Table 1 gives several elements of the resulting signal constellation S for several permutations of x₀ and x₁. Only a few of the possible constellation points are illustrated, with different “shapes” of z.

TABLE 1

P₀(x₀)            P₁(x₁)            z
[−1 1 0 0 0]      [−3 3 0 0 0]      [−4 4 0 0 0]
[1 −1 0 0 0]      [−3 3 0 0 0]      [−2 2 0 0 0]
[−1 0 1 0 0]      [−3 3 0 0 0]      [−4 3 1 0 0]
[−1 0 1 0 0]      [3 −3 0 0 0]      [2 −3 1 0 0]
[1 0 −1 0 0]      [−3 3 0 0 0]      [−2 3 −1 0 0]
[1 0 −1 0 0]      [3 −3 0 0 0]      [4 −3 −1 0 0]
[0 0 1 −1 0]      [3 −3 0 0 0]      [3 −3 1 −1 0]

The elements of the resulting signal constellation are all permutations of one of the following vectors: [−4 4 0 0 0], [−2 2 0 0 0], [−4 0 0 1 3], [−3 0 0 1 2], [−3 −1 0 0 4], [−2 −1 0 0 3] and [−3 −1 0 1 3].

Since the elements of x₀ and the elements of x₁ each sum to zero, the elements of every word of S sum to zero as well. In a preferred embodiment, the transmit unit may use 16 different permutations of x₀ and 16 different permutations of x₁. In that case, k₀=4 and k₁=4, and with the resulting superposition signaling code eight bits may be encoded into a codeword for transmission over a communication bus comprising five wires. This superposition signaling code is referred to as an “8b5w” code. (In the general case, the code is a “kbnw” code, indicating that it signals k bits over n wires each period.)
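The properties stated for this first example can be checked by brute-force enumeration. The following is a minimal sketch for verification only, with hypothetical names; it is not part of any encoder described herein.

```python
from itertools import permutations


def distinct_permutations(x):
    """All distinct orderings of the components of basis vector x."""
    return sorted(set(permutations(x)))


x0 = (1, 0, 0, 0, -1)
x1 = (3, 0, 0, 0, -3)
layer0 = distinct_permutations(x0)
layer1 = distinct_permutations(x1)

# Map each superposition z = P0(x0) + P1(x1) to the pairs that produce it.
constellation = {}
for p0 in layer0:
    for p1 in layer1:
        z = tuple(a + b for a, b in zip(p0, p1))
        constellation.setdefault(z, []).append((p0, p1))

print(len(layer0), len(layer1), len(constellation))  # 20 20 400
print(all(len(pairs) == 1 for pairs in constellation.values()))  # one-to-one
print(all(sum(z) == 0 for z in constellation))  # every codeword sums to zero
```

Each z arises from exactly one pair of permutations, which is the one-to-one property relied upon when decoding.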

A second example of a superposition signaling code comprising two layers is furnished by choosing the basis vectors x₀ and x₁ as x₀=[1 1 −1 −1] and x₁=[2 2 −2 −2]. Table 2 gives several examples of elements of the resulting signal constellation S for several permutations of x₀ and x₁.

TABLE 2

P₀(x₀)          P₁(x₁)          z
[−1 −1 1 1]     [−2 −2 2 2]     [−3 −3 3 3]
[−1 −1 1 1]     [2 2 −2 −2]     [1 1 −1 −1]
[−1 −1 1 1]     [−2 2 −2 2]     [−3 1 −1 3]
[1 −1 1 −1]     [−2 2 −2 2]     [−1 1 −1 1]
[1 1 −1 −1]     [−2 2 −2 2]     [−1 3 −3 1]

There exist six different permutations of x₀ and six different permutations of x₁. The resulting superposition signaling code has 36 different elements. These elements are all permutations of one of the following basis vectors: [−3 −3 3 3], [1 1 −1 −1] and [−3 1 −1 3]. Each of these vectors has the property that the sum of all elements is equal to 0. In a preferred embodiment, encoder 420 may use four different permutations of x₀ and four different permutations of x₁ and thus k₀=2 and k₁=2. This allows at least four bits to be encoded into a codeword. This superposition signaling code is referred to as a 4b4w code.

An example of a superposition signaling code comprising three layers is furnished by choosing the basis vectors x₀, x₁ and x₂ as x₀=[1 1 0 0 0 0 −1 −1], x₁=[3 3 0 0 0 0 −3 −3] and x₂=[9 9 0 0 0 0 −9 −9]. Table 3 gives several examples of elements of the resulting signal constellation S.

TABLE 3

P₀(x₀)                    P₁(x₁)                    P₂(x₂)                    z
[−1 −1 1 1 0 0 0 0]       [−3 −3 3 3 0 0 0 0]       [−9 −9 9 9 0 0 0 0]       [−13 −13 13 13 0 0 0 0]
[0 −1 0 1 0 −1 0 1]       [3 3 −3 0 0 −3 0 0]       [0 9 −9 0 9 −9 0 0]       [3 11 −12 1 9 −13 0 1]
[0 1 0 −1 0 1 −1 0]       [−3 −3 3 3 0 0 0 0]       [9 0 −9 0 0 −9 0 9]       [6 −2 −6 2 0 −8 −1 9]
[0 1 0 −1 0 1 −1 0]       [3 0 0 −3 3 −3 0 0]       [9 0 −9 0 0 −9 0 9]       [12 1 −9 −4 3 −11 −1 9]

There exist 420 different permutations of each of the basis vectors x₀, x₁ and x₂. In a preferred embodiment, 256 permutations of each of the vectors x₀, x₁ and x₂ are used, so that k₀=k₁=k₂=8, and the resulting superposition signaling code is referred to as a 24b8w code.
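One way to see that the layers of this three-layer example can be separated again is to note that each component of z has the form a+3b+9c with a, b and c each in {−1, 0, 1}, which is a balanced ternary expansion and is therefore unique. The following sketch, with a hypothetical function name, peels the layers apart on that basis. It is offered as an illustration of the one-to-one property and is not the decoder of any embodiment described herein.

```python
def split_layers(z, num_layers):
    """Recover the per-layer vectors scaled by 1, 3, 9, ... from a codeword z."""
    layers = []
    rest = list(z)
    for i in range(num_layers):
        digits = []
        reduced = []
        for v in rest:
            d = ((v + 1) % 3) - 1          # balanced ternary digit in {-1, 0, 1}
            digits.append(d)
            reduced.append((v - d) // 3)   # exact: v - d is a multiple of 3
        layers.append([d * 3 ** i for d in digits])
        rest = reduced
    return layers


# Second row of Table 3.
z = [3, 11, -12, 1, 9, -13, 0, 1]
for layer in split_layers(z, 3):
    print(layer)
```

Applied to the second row of Table 3, the sketch returns the three permuted basis vectors listed in that row.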

Encoding Processes for Superposition Signaling Codes

A first step of encoding a superposition signaling code comprises encoding the basis vectors for individual permutation modulation codes. For the i-th layer of the superposition signaling code, this process amounts to mapping k_(i) bits uniquely to permutations of the basis vector x_(i). This might not be straightforward in general if an encoder has to run at very high speeds and consume very low power. A general approach of encoding a set of k₀ bits into a permutation of x₀ uses an index generator.

FIG. 6 illustrates an encoder that uses an index generator. As shown there, a PM encoder 815 comprises inputs 810, an index generator 820, a storage unit 830 having storage elements 840, and outputs 850. Storage elements 840 can be capacitive storage, inductive storage, circuit storage, memory or other suitable circuit elements to store index values. PM encoder 815 might be used as an encoder 550 of FIG. 5.

A general operation of PM encoder 815 will be explained with reference to one basis vector x₀ but by extension can be repeated for the other layers of the superposition signaling code. In FIG. 6, the number of bits in is k and the number of signals out is n, but since PM encoder 815 is for one layer (multiple instances are used for other layers), it handles k₀ bits and is defined by a basis vector x₀ of size n that has the form shown in Equation 8.

$\begin{matrix} {x_{0} = \left( {\underbrace{a_{0},\ldots,a_{0}}_{n_{0}},\;\underbrace{a_{1},\ldots,a_{1}}_{n_{1}},\;\ldots,\;\underbrace{a_{t},\ldots,a_{t}}_{n_{t}}} \right)} & \left( {{Eqn}.\mspace{11mu} 8} \right) \end{matrix}$

Input 810 of PM encoder 815 comprises k₀ bits b₀, . . . , b_(k₀−1) and a task of PM encoder 815 is to generate a permutation of x₀ denoted by P₀(x₀). PM encoder 815 comprises an index generator 820 whose task is to generate a set of N indices, where N is the number of components of x₀ whose values are among a₀, . . . , a_(t−1), as given by Equation 9.

$\begin{matrix} {N = {\sum\limits_{i = 0}^{t - 1}n_{i}}} & \left( {{Eqn}.\mspace{11mu} 9} \right) \end{matrix}$

Index generator 820 generates n₀ indices that identify the position of the a₀'s in the permutation of P₀(x₀), n₁ indices that identify the position of the a₁'s in P₀(x₀), etc. Indices need only be generated for a₀, . . . , a_(t−1) since P₀(x₀) is completely defined by these indices.

The set of these N indices is denoted by i₀, . . . , i_(N−1). PM encoder 815 may comprise storage unit 830 that contains n storage elements 840. Each of these storage elements 840 is able to store at least the different values that occur in x₀. The storage elements 840 are initialized to the value a_(t). The indices i₀, . . . , i_(N−1) are used to set the corresponding storage elements 840 to the value that corresponds to the index. In this way, a permutation of x₀ is available in storage unit 830. The output of PM encoder 815 is a set of n signals 850 that are denoted by s₀, . . . , s_(n−1). Each of the output signals 850 is connected to one of the storage elements 840 in storage unit 830.

The value of the output signals 850 is a physical representation of the value in the corresponding storage element 840. There are other ways to specify an encoder for a permutation modulation code. The encoder exemplified in FIG. 6 is meant only as an illustration for such a specification. In practice, a simple process is preferred for generating the indices i₀, . . . , i_(N−1). Several such processes for several basis vectors x₀ are disclosed below.

The 4b5w Code

Consider the case where one or more layers of the superposition signaling code is defined by a permutation modulation code that is defined by the basis vector x=[1 0 0 0 −1]. This vector may be scaled by an arbitrary constant. For instance, the two layers of the 8b5w code can be defined by two scaled versions of x. In a preferred embodiment, 16 different permutations of this basis vector are used. This allows the encoder to encode four bits into a permutation of x. This is referred to herein as the 4b5w code.

FIG. 7 is a flowchart of an encoding process that may be used by index generator 820 to define a permutation of x. There, the permutation of x is denoted by P(x). The process might be performed by an encoder. As indicated, from the input (four bits, b₀, . . . , b₃) 910, two indices, i₀ and i₁ are generated. The index i₀ may indicate the position of the −1 in P(x) and the index i₁ may indicate the position of the 1 in P(x).

At step 920, the bits are split in two pairs and represented by their integer representations, t₀ and t₁. At step 921, the two integers t₀ and t₁ are compared (921). If they are equal, the index i₀ is set to t₀ and the index i₁ is set to four (922). If t₀ and t₁ are not equal, the index i₀ is set to t₀ and the index i₁ is set to t₁ (930). The output of the process comprises the indices i₀ and i₁. Storage elements 840 of the encoder may be initialized by a value of 0 and the indices i₀ and i₁ may be used to set the i₀-th storage element of 840 to −1 and to set the i₁-th storage element to 1. This results in the mapping from bits to permutation of x as shown in Table 4.
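The process of FIG. 7, as described above, might be sketched as follows. The function names are hypothetical, and the sketch assumes that each pair of bits is converted to an integer with the first bit of the pair as the more significant bit, i.e., t₀=2b₀+b₁ and t₁=2b₂+b₃.

```python
def index_generator_4b5w(bits):
    """Return (i0, i1): the positions of the -1 and of the 1 in P(x)."""
    b0, b1, b2, b3 = bits
    t0 = 2 * b0 + b1             # step 920: first pair of bits as an integer
    t1 = 2 * b2 + b3             # step 920: second pair of bits as an integer
    i0 = t0
    i1 = 4 if t0 == t1 else t1   # steps 921, 922 and 930
    return i0, i1


def encode_4b5w(bits):
    """Set the storage elements to form a permutation of [1 0 0 0 -1]."""
    i0, i1 = index_generator_4b5w(bits)
    storage = [0] * 5            # storage elements initialized to 0
    storage[i0] = -1
    storage[i1] = 1
    return storage


print(encode_4b5w((0, 0, 0, 0)))  # [-1, 0, 0, 0, 1]
print(encode_4b5w((0, 0, 0, 1)))  # [-1, 1, 0, 0, 0]
```

The fifth position is used for the 1 only when the two integers coincide, which is what keeps the sixteen outputs distinct.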

TABLE 4

b₀, b₁, b₂, b₃     P(x)
0, 0, 0, 0         [−1, 0, 0, 0, 1]
0, 0, 0, 1         [−1, 1, 0, 0, 0]
0, 0, 1, 0         [−1, 0, 1, 0, 0]
0, 0, 1, 1         [−1, 0, 0, 1, 0]
0, 1, 0, 0         [1, −1, 0, 0, 0]
0, 1, 0, 1         [0, −1, 0, 0, 1]
0, 1, 1, 0         [0, −1, 1, 0, 0]
0, 1, 1, 1         [0, −1, 0, 1, 0]
1, 0, 0, 0         [1, 0, −1, 0, 0]
1, 0, 0, 1         [0, 1, −1, 0, 0]
1, 0, 1, 0         [0, 0, −1, 0, 1]
1, 0, 1, 1         [0, 0, −1, 1, 0]
1, 1, 0, 0         [1, 0, 0, −1, 0]
1, 1, 0, 1         [0, 1, 0, −1, 0]
1, 1, 1, 0         [0, 0, 1, −1, 0]
1, 1, 1, 1         [0, 0, 0, −1, 1]

Extensions of the 4b5w Code

The 4b5w code described above can be generalized to a larger family of permutation modulation codes with similar properties. This family of permutation modulation codes is defined by a basis vector of the following form shown in Equation 10.

$\begin{matrix} {x = \left\lbrack {1\;\;\underbrace{0\;\;\ldots\;\;0}_{2^{k} - 1}\;\;{- 1}} \right\rbrack} & \left( {{Eqn}.\mspace{11mu} 10} \right) \end{matrix}$

The dimension of the basis vector x is 2^(k)+1 and it contains a single 1, a single −1 and 2^(k)−1 zeros. The 4b5w code is a specific example for k=2. The basis vector x given by Equation 10 allows the encoder to map 2k bits into a permutation of x.

FIG. 8 is a flowchart of an encoding process that may be used by an index generator to define a permutation of the more general basis vector of Equation 10. As illustrated there, inputs 1010 are a set of 2k bits, denoted b₀, . . . , b_(2k−1). The output of the process is the two integers i₀ and i₁. In step 1020, two indices t₀ and t₁ are formed based on the input bits b₀, . . . , b_(2k−1). In step 1021, the indices t₀ and t₁ are compared. If they are equal, the output integers i₀ and i₁ are set according to step 1022. If they are not equal, the output integers i₀ and i₁ are set according to step 1030. Storage elements 840 may be initialized by a value of 0 and the indices i₀ and i₁ may be used to set the i₀-th storage element of 840 to −1 and to set the i₁-th storage element to 1 to finalize the encoding of the 2k bits into a permutation of x.
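A minimal sketch of such a generalized process follows. The text above does not spell out the calculations of steps 1022 and 1030, so the sketch assumes, by analogy with FIG. 7, that t₀ and t₁ are the integers formed by the first k and the last k bits, and that on equality the 1 is placed in the last position, 2^(k). All names are hypothetical.

```python
from itertools import product


def bits_to_int(bits):
    """Interpret a sequence of bits as an integer, most significant bit first."""
    value = 0
    for b in bits:
        value = 2 * value + b
    return value


def encode_2k_bits(bits, k):
    """Map 2k bits to a permutation of [1 0 ... 0 -1] of dimension 2**k + 1."""
    if len(bits) != 2 * k:
        raise ValueError("expected 2k bits")
    t0 = bits_to_int(bits[:k])
    t1 = bits_to_int(bits[k:])
    i0 = t0                              # position of the -1
    i1 = 2 ** k if t0 == t1 else t1      # assumed analogue of steps 1022 and 1030
    word = [0] * (2 ** k + 1)            # storage elements initialized to 0
    word[i0] = -1
    word[i1] = 1
    return word


k = 3
codewords = {tuple(encode_2k_bits(bits, k)) for bits in product((0, 1), repeat=2 * k)}
print(len(codewords))  # 64 distinct codewords for k = 3
```

For k=2 this reduces to the 4b5w mapping in which equal bit pairs place the 1 in the fifth position.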

The 8b8w Code

Suppose one or more layers of the superposition signaling code is defined by a PM code that is defined by the basis vector x=[1 1 0 0 0 0 −1 −1]. Note that this vector may be scaled by an arbitrary constant. The three layers of the 24b8w code are defined by three scaled versions of x. In the example above, 256 different permutations of x are used for the 8b8w code.

FIG. 9 is a flowchart of an encoding process that may be used by an index generator to define a permutation of a basis vector for an 8b8w code. That process might be implemented by index generator 820 to generate four indices i₀, . . . , i₃ that correspond to the positions of the 1's and the −1's in the permutation of x. The inputs 1110 to the process are the bits b₀, . . . , b₇ and the outputs 1190 of the encoding process are the four indices i₀, i₁, i₂ and i₃. The indices i₀ and i₂ are equal to the positions of the −1's in a vector that corresponds to the permutation of the basis vector x₀ and the indices i₁ and i₃ are equal to the positions of the 1's. In step 1120, four indices t₀, . . . , t₃ are formed from the eight input bits in 1110. The way this is done is by converting pairs of input bits to their decimal representation, i.e., b₀ and b₁ are converted to 2b₀+b₁. In the process exemplified in FIG. 9, four different cases can be distinguished.

The first case is tested for in step 1121 and occurs when t₀=t₁ and t₂=t₃. When this case is true, the indices i₀, . . . , i₃ are set according to the calculations performed in step 1122.

The second case is tested for in step 1123 and occurs when t₀=t₁ and t₂≠t₃. When this case is true, indices i₀, . . . , i₃ are set according to the calculations performed in step 1124. In step 1124, the set A is formed as follows. Initially, the set A contains the integers 0, 1, 2 and 3. From this set, the integers t₂ and t₃ are removed. The smallest element of the remaining set is denoted by A[0] and the largest element of the remaining set is denoted by A[1].
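The formation of the set A might be sketched as follows, with a hypothetical function name. How A[0] and A[1] are then combined into the indices i₀, . . . , i₃ is defined by the calculations of step 1124 in FIG. 9 and is not reproduced in this sketch.

```python
def remaining_positions(t2, t3):
    """Return (A[0], A[1]) after removing t2 and t3 from {0, 1, 2, 3}."""
    if t2 == t3:
        raise ValueError("this case requires t2 and t3 to differ")
    a = sorted({0, 1, 2, 3} - {t2, t3})
    return a[0], a[1]   # smallest and largest of the two remaining integers


print(remaining_positions(0, 3))  # (1, 2)
print(remaining_positions(2, 1))  # (0, 3)
```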

The third case is tested for in step 1125 and occurs when t₀≠t₁ and t₂=t₃. When this case is true, indices i₀, . . . , i₃ are set according to the calculations performed in step 1126.

The fourth case is the default case and occurs when t₀≠t₁ and t₂≠t₃. When this case occurs, the indices i₀, . . . , i₃ are set according to step 1127. The output of the process comprises the indices i₀, i₁, i₂, i₃.

The storage elements 840 may be initialized by a value of 0 and the indices i₀ and i₂ may be used to set the i₀-th and i₂-th storage element of 840 to −1 and to set the i₁-th and i₃-th storage element to 1.

The resulting map from bits b₀, . . . , b₇ to indices i₀, i₁, i₂, i₃ is illustrated in FIG. 10, which shows a map from the decimal representation of b₀, . . . , b₇ to the indices i₀, . . . , i₃.

In a more general case, the problem is to find efficient processes for encoding, or else any gains in transmission power consumption might be lost to increases in computing power consumption. The above example is specific to mapping 8 input bits to one of 256 permutations of a vector of the form [a a b b b b c c] and is relatively simple to implement in hardware. The equality tests break up the encoding task into disjoint sub-problems and for each of these sub-problems, encoding can be performed such that the overall mapping is one-to-one.
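For comparison, a generic baseline that does not use the equality tests of FIG. 9 is a plain lookup table. The following sketch is such a baseline and is not the process of FIG. 9: it simply selects the first 256 of the 420 distinct permutations in sorted order, so its mapping differs from the map of FIG. 10. The names are hypothetical.

```python
from itertools import permutations

X = (1, 1, 0, 0, 0, 0, -1, -1)

# 420 distinct permutations exist; keep 256 of them as a codebook.
ALL_PERMUTATIONS = sorted(set(permutations(X)))
CODEBOOK = ALL_PERMUTATIONS[:256]


def encode_8b8w_table(bits):
    """Map eight bits to a codebook entry by treating them as an index."""
    index = 0
    for b in bits:
        index = 2 * index + b
    return CODEBOOK[index]


print(len(ALL_PERMUTATIONS), len(CODEBOOK))  # 420 256
print(encode_8b8w_table((0, 0, 0, 0, 0, 0, 0, 0)))
```

A table of this kind needs 256 stored entries per layer, which illustrates why a process built from a few comparisons may be preferred in hardware.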

Examples of Construction of Superposition Signaling Codes

In the above examples, each permutation modulation code of the superposition signaling code is encoded separately and these layers are then combined to obtain a codeword of the superposition signaling code.

FIG. 11 is a block diagram of an encoder for the 8b5w code. The encoder comprises two index generators 1320, 1322. The inputs 1310 to the first index generator 1320 are the bits b₀, . . . , b₃. The index generator 1320 may implement the process of FIG. 7 to map the bits b₀, . . . , b₃ to a permutation of the basis vector x₀ that defines the first layer of the 8b5w superposition signaling code. This basis vector may be given by x₀=[1 0 0 0 −1]. Storage elements 1325 of a storage device 1324 are initially set to 0. The indices generated by index generator 1320 are used to set the corresponding storage elements 1325 of storage device 1324. The outputs 1330 of storage device 1324 are denoted by v₀, . . . , v₄ and the i-th of these outputs 1330 may be a physical representation of the i-th storage element in storage device 1324.

The inputs 1312 to the second index generator 1322 are the bits b₄, . . . , b₇. The index generator 1322 may implement the process of FIG. 7 to map the bits b₄, . . . , b₇ to a permutation of the basis vector defining the second layer of the 8b5w superposition signaling code. This basis vector may be given by x₁=[3 0 0 0 −3]. Storage elements 1327 of a storage device 1326 are initially set to 0. The indices generated by index generator 1322 are used to set the corresponding storage elements 1327 of storage device 1326. The outputs 1332 of storage device 1326 are denoted by w₀, . . . , w₄ and the i-th of these outputs 1332 may be a physical representation of the i-th storage element in storage device 1326.

The output of storage device 1324 and the output of storage device 1326 are the input of a combine unit 1340. Combine unit 1340 generates its outputs 1350, which are denoted by c₀, . . . , c₄, as the vector addition of v₀, . . . , v₄ and w₀, . . . , w₄ as illustrated in Equation 11.

$\begin{matrix} {\begin{bmatrix} c_{0} \\ c_{1} \\ c_{2} \\ c_{3} \\ c_{4} \end{bmatrix} = {\begin{bmatrix} v_{0} \\ v_{1} \\ v_{2} \\ v_{3} \\ v_{4} \end{bmatrix} + \begin{bmatrix} w_{0} \\ w_{1} \\ w_{2} \\ w_{3} \\ w_{4} \end{bmatrix}}} & \left( {{Eqn}.\mspace{11mu} 11} \right) \end{matrix}$

The addition is performed by addition units 1342.
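The data flow of FIG. 11 might be modeled in software as follows. This is a sketch under the assumption that both index generators implement the process of FIG. 7 with bit pairs read most significant bit first; the function names are hypothetical.

```python
def index_generator(bits4):
    """Process of FIG. 7: positions of the negative and positive entries."""
    t0 = 2 * bits4[0] + bits4[1]
    t1 = 2 * bits4[2] + bits4[3]
    return t0, (4 if t0 == t1 else t1)


def fill_storage(indices, scale):
    """Five storage elements, initially 0, set to -scale and +scale."""
    i0, i1 = indices
    storage = [0] * 5
    storage[i0] = -scale
    storage[i1] = scale
    return storage


def encode_8b5w(bits8):
    """Superpose layer 0 (x0 = [1 0 0 0 -1]) and layer 1 (x1 = [3 0 0 0 -3])."""
    v = fill_storage(index_generator(bits8[:4]), 1)   # from bits b0..b3
    w = fill_storage(index_generator(bits8[4:]), 3)   # from bits b4..b7
    return [v_i + w_i for v_i, w_i in zip(v, w)]      # Equation 11


print(encode_8b5w((0, 0, 0, 1, 0, 1, 0, 0)))  # [2, -2, 0, 0, 0]
```

In this model the element-wise sum plays the role of addition units 1342.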

FIG. 12 shows the counterpart of an encoder for the 24b8w code. The encoder comprises three index generators 1420, 1422, 1424. The inputs 1410 to the first index generator 1420 are the bits b₀, . . . , b₇. The index generator 1420 may implement the process of FIG. 9 to generate four indices corresponding to the non-zero positions of a permutation of the basis vector x₀, which for the first layer of the 24b8w code is given by x₀=[1 1 0 0 0 0 −1 −1].

The inputs 1412 to the second index generator 1422 are the bits b₈, . . . , b₁₅. The index generator 1422 may implement the process of FIG. 9 to generate four indices corresponding to the non-zero positions of a permutation of the basis vector x₁, which for the second layer of the 24b8w code is given by x₁=[3 3 0 0 0 0 −3 −3].

The inputs 1414 to the third index generator 1424 are the bits b₁₆, . . . , b₂₃. The index generator 1424 may implement the process of FIG. 9 to generate four indices corresponding to the non-zero positions of a permutation of the basis vector x₂, which for the third layer of the 24b8w code is given by x₂=[9 9 0 0 0 0 −9 −9].

The outputs of each index generator 1420, 1422, 1424 are sent to a corresponding storage device 1430, 1432, 1434. Each of the storage elements of these storage devices 1430, 1432, 1434 is initialized with a value of 0. The storage elements corresponding to the indices received from the corresponding index generators 1420, 1422, 1424 are set to the non-zero values of the corresponding basis vector. The output of storage device 1430 is denoted by v₀, . . . , v₇, the output of storage device 1432 is denoted by w₀, . . . , w₇ and the output of storage device 1434 is denoted by u₀, . . . , u₇. These outputs are input to a combine unit 1440 that generates the outputs c₀, . . . , c₇ as illustrated in Equation 12 as the vector addition of the outputs of the three storage devices.

$\begin{matrix} {\begin{bmatrix} c_{0} \\ c_{1} \\ c_{2} \\ c_{3} \\ c_{4} \\ c_{5} \\ c_{6} \\ c_{7} \end{bmatrix} = {\begin{bmatrix} v_{0} \\ v_{1} \\ v_{2} \\ v_{3} \\ v_{4} \\ v_{5} \\ v_{6} \\ v_{7} \end{bmatrix} + \begin{bmatrix} w_{0} \\ w_{1} \\ w_{2} \\ w_{3} \\ w_{4} \\ w_{5} \\ w_{6} \\ w_{7} \end{bmatrix} + \begin{bmatrix} u_{0} \\ u_{1} \\ u_{2} \\ u_{3} \\ u_{4} \\ u_{5} \\ u_{6} \\ u_{7} \end{bmatrix}}} & \left( {{Eqn}.\mspace{14mu} 12} \right) \end{matrix}$

The 8b8w code and the 24b8w code are members of a family of superposition signaling codes. The members of this family are the (8l)b8w codes, where l denotes the number of layers. The basis vector for the i-th layer is given by [3^(i) 3^(i) 0 0 0 0 −(3^(i)) −(3^(i))], where the first layer is indexed by i=0. For these codes the architecture of FIG. 12 may be extended to comprise l index generators, l storage devices and a combine unit that can handle l inputs.

Drivers and Combine Units

Voltage-Mode and Current-Mode Drivers

FIG. 13 illustrates an example circuit that might be used for driver 430 in the circuit of FIG. 3. In this example, the signals are voltage signals. Thus, for a superposition signaling code of length n, the wires of the communication bus are driven in voltage mode. Inputs 1510 to the driver are the numbers c₀, . . . , c_(n−1). These numbers may be represented in some physical form, such as a current or voltage in an electronic circuit. In a bus communication system, a set of n of these numbers may be available every T seconds and these numbers may represent a codeword from a superposition signaling code. The driver may comprise a set of n controlled voltage sources 1520. Each of the voltage sources 1520 is connected to one of the outputs 1530. Outputs 1530 are denoted by s₀(t), . . . , s_(n−1)(t). Each of the voltage sources 1520 may generate a pulse shape, p(t), that may be modulated according to inputs 1510. The i-th voltage source may generate the i-th output as s_(i)(t)=c_(i)p(t). As should be apparent upon reading this disclosure, the driver of FIG. 13 drives the wires according to Equation 1.

Every interval of T seconds, a new codeword of the superposition signaling code may be available and the driver may change its outputs 1530 accordingly. Since the driver comprises voltage sources 1520, the output signals s₀(t), . . . , s_(n−1)(t) define the voltage on the wires.

The driver may include other functions such as equalization, amplification and crosstalk cancellation. Furthermore, the driver may terminate the bus in its characteristic impedance.

In another embodiment, the wires of the communication bus are driven in current-mode. In such an embodiment of the driver, the controlled voltage sources 1520 of FIG. 13 might be replaced by controlled current sources.

This is illustrated in FIG. 14, wherein inputs 1610 of the driver may be used to control the controlled current sources 1620. In this case, outputs 1630 of the driver (denoted by s₀(t), . . . , s_(n−1)(t)) define the currents on the wires of the communication bus.

The currents generated by current-sources 1620 may be positive or negative. In a practical setting, it may be difficult to generate currents with both positive and negative values. Separate current-sources may be used for the positive and negative currents to deal with this situation.

FIG. 15 is a block diagram of a circuit that might be used. As shown there, inputs 1710 to the driver are denoted by c₀, . . . , c_(n−1) and denote a representation of the codeword from the superposition signaling code. A unit 1715 generates two sets of numbers from c₀, . . . , c_(n−1). The first set of these numbers 1720 is denoted by I₀, . . . , I_(n−1) and is generated by unit 1715 according to Equation 13.

$\begin{matrix} {I_{i} = \frac{\left| c_{i} \right| + c_{i}}{2}} & \left( {{Eqn}.\mspace{11mu} 13} \right) \end{matrix}$

The second set of these numbers is denoted by J₀, . . . , J_(n−1) and is generated by unit 1715 according to Equation 14.

$\begin{matrix} {J_{i} = \frac{\left| c_{i} \right| - c_{i}}{2}} & \left( {{Eqn}.\mspace{11mu} 14} \right) \end{matrix}$

The numbers I₀, . . . , I_(n−1) correspond to the absolute values of the positive values of c₀, . . . , c_(n−1). The numbers J₀, . . . , J_(n−1) correspond to the absolute values of the negative values of c₀, . . . , c_(n−1). The driver may comprise n controlled current-sources 1740. The i-th of these controlled current sources 1740 sources a current of size I_(i) into the i-th wire. The current is sourced from the positive terminal of a voltage source 1730 that has value Vdd. Furthermore, the driver may comprise n controlled current-sources 1742. The i-th of these controlled current sources 1742 sinks a current of size J_(i) from the i-th wire. The current is sunk into the negative terminal of voltage source 1730. The negative terminal of voltage source 1730 may be connected to ground.
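The splitting performed by unit 1715 can be illustrated with a minimal sketch. The function name `split_currents` and the example codeword are hypothetical and chosen only for illustration; the arithmetic follows Equations 13 and 14.

```python
def split_currents(codeword):
    """Split each entry c_i into a sourced magnitude I_i and a sunk magnitude J_i.

    Illustrative helper (not part of the disclosed circuit): I_i follows
    Equation 13 and J_i follows Equation 14.
    """
    sourced = [(abs(c) + c) / 2 for c in codeword]  # I_i: nonzero only where c_i > 0
    sunk = [(abs(c) - c) / 2 for c in codeword]     # J_i: nonzero only where c_i < 0
    return sourced, sunk


# Hypothetical codeword whose entries sum to zero.
example_codeword = [4, 0, -1, 0, -3]
I, J = split_currents(example_codeword)
print(I)  # [4.0, 0.0, 0.0, 0.0, 0.0]
print(J)  # [0.0, 0.0, 1.0, 0.0, 3.0]
```

Because the entries of the example sum to zero, the total sourced current equals the total sunk current.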

FIG. 16 is an illustration of examples for signals, to provide a better understanding of the signals that are driven on the communication bus. The pulse shape, p(t), may be chosen as pulse shape 520 (see FIG. 4(b)). Consider the case where the time interval, T, is 200 picoseconds. The encoder may implement the 8b5w superposition signaling code, in which case the signals s₀(t), . . . , s_(n−1)(t) that are driven on the five wires of the communication bus may have the form as shown in FIG. 16.

FIG. 16 shows the waveforms s₀(t), . . . , s_(n−1)(t) for several time intervals of 200 picoseconds each. In each time interval, eight (possibly) arbitrary bits are encoded into an element of the 8b5w code. The driver may generate the signals as shown in FIG. 15 for transmission over the bus. The signals shown in FIG. 16 may represent voltages and/or currents.

The combine units of the encoder may be implemented as summers for physical addition of physical quantities. In some cases, the combine unit may be omitted from the encoder itself or implemented by the existence of particular wires. One advantage of this is that no additional hardware is required to perform the actual operation performed by the combine unit.

FIG. 17 illustrates a variation of a transmit unit wherein separate drivers appear prior to a combine unit. In this embodiment, a transmit unit does not include an explicit combine unit. Instead, the transmit unit encodes using a superposition signaling code that comprises two layers. The transmit unit may comprise two encoders 1920, 1922 for the two permutation modulation codes defining the two layers of the superposition signaling code. The result of the encoder 1920 is input to a driver 1930 and the result of the encoder 1922 is input to a driver 1932. Both drivers 1930 and 1932 may drive wires of a communication bus independently. The actual combining is performed in combine unit 1940 and may be physical in nature.

Note that the addition of two currents that are sourced into the same conductor, the addition or averaging of voltages induced in a network comprising passive resistors, or the like will effect an addition or summation without explicit active circuit elements. The example of FIG. 17 can be easily extended to any number of layers of the superposition signaling code.

FIG. 18 illustrates a transmit unit that might be used as transmit unit 410 (see FIG. 3) and implements the 8b5w code with combining performed by a physical addition. The transmit unit comprises two encoders 2020, 2024. The inputs 2010 to the first encoder 2020 are the bits b₀, . . . , b₃. These bits may be encoded into a permutation of the basis vector x₀=[1 0 0 0 −1]. For this purpose, the encoder 2020 may comprise an index generator 2022 and storage device 2030. The index generator 2022 may implement the process as exemplified in FIG. 7 to generate two indices. These two indices may be used to set the non-zero values corresponding to the permutation of x₀ in storage device 2030. The values stored and/or buffered in the storage device 2030 may be used to control the current that is sourced by the controlled current sources 2040 into the wires 445 of the communication bus 440.

The inputs 2012 to the second encoder 2024 are the bits b₄, . . . , b₇. These bits may be encoded into a permutation of the basis vector x₁=[3 0 0 0 −3]. The encoder 2024 may comprise an index generator 2026 and storage device 2032. The index generator 2026 may implement the process as exemplified in FIG. 7 to generate two indices. These two indices may be used to set the non-zero values corresponding to the permutation of x₁ in storage device 2032. The values stored and/or buffered in the storage device 2032 may be used to control the current that is sourced by the controlled current sources 2042 into the wires 445 of the communication bus 440.

The final operation of encoding the 8b5w code comprises combining the permutations of x₀ and x₁. In the embodiment of FIG. 18, the operation of combining is performed in nodes 2050 where the currents of controlled current sources 2040, 2042 are superimposed to create signals transmitted on the bus according to the 8b5w code.
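The combining step can be sketched as follows. This is only an illustration under stated assumptions: the helper names `place` and `superpose` are hypothetical, the index pairs are picked by hand rather than produced by the index-generator process of FIG. 7, and the addition is done numerically instead of by superimposing currents at nodes 2050.

```python
def place(n, plus_index, minus_index, amplitude):
    """Build a permutation of [a, 0, ..., 0, -a] from two wire indices (hypothetical helper)."""
    if plus_index == minus_index:
        raise ValueError("the two indices must differ")
    word = [0] * n
    word[plus_index] = amplitude
    word[minus_index] = -amplitude
    return word


def superpose(layer0, layer1):
    """Add the two layers wire by wire, mimicking the summation at the bus nodes."""
    return [a + b for a, b in zip(layer0, layer1)]


# Hand-picked index pairs; in FIG. 18 they would come from index generators 2022 and 2026.
layer0 = place(5, plus_index=0, minus_index=2, amplitude=1)  # permutation of x0 = [1 0 0 0 -1]
layer1 = place(5, plus_index=0, minus_index=4, amplitude=3)  # permutation of x1 = [3 0 0 0 -3]
codeword = superpose(layer0, layer1)
print(codeword)  # [4, 0, -1, 0, -3]
```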

Signal-to-Digital Converter (“SDC”)

FIG. 19 illustrates an example Signal-to-Digital Converter (“SDC”) as might be used for SDC 460 in the circuit of FIG. 3. A task of an SDC is to sense the signals on the bus and generate a digital representation for the word of the superposition signaling code that is transmitted on the communication bus.

As shown in FIG. 19, an SDC has inputs 2110 denoted by y₀(t), . . . , y_(n−1)(t). The SDC may comprise n analog-to-digital converters (ADCs) 2120 where the i-th ADC samples the i-th signal y_(i)(t). The SDC may sample the signals y₀(t), . . . , y_(n−1)(t) at the optimal sample moment and generate n numbers that are denoted by d₀, . . . , d_(n−1). The numbers d₀, . . . , d_(n−1) may be a discrete estimate of the transmitted word from the superposition code. The resolution of the ADCs 2120 may be adjusted to the superposition signaling code used.

When, for instance, the number of different levels observed at each of the wires is equal to M, a decoder of a receive unit may choose the resolution of the ADC as at least log₂(M) bits. The SDC may also perform additional tasks such as amplification, filtering, equalization and clock recovery or any other functions.
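As a small worked example of the resolution bound, the sketch below computes the smallest whole number of bits that is at least log₂(M). The function name `min_adc_bits` is hypothetical.

```python
def min_adc_bits(levels):
    """Smallest integer number of bits b with 2**b >= levels (illustrative helper)."""
    if levels < 1:
        raise ValueError("levels must be a positive integer")
    return (levels - 1).bit_length()


print(min_adc_bits(9))  # 4, since 2**3 = 8 < 9 <= 16 = 2**4
print(min_adc_bits(4))  # 2
```

For a code with nine levels per wire, such as the 8b5w code, this gives four bits per ADC.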

The SDC may employ a joint conversion architecture across the wires to generate the numbers d₀, . . . , d_(n−1). Such a joint conversion architecture may lead to less hardware complexity and lower power consumption.

Decoding Superposition Signaling Codes

The input to a decoder can be the set of n numbers d₀, . . . , d_(n−1) which are output by an SDC. The numbers d₀, . . . , d_(n−1) may represent an estimate for the transmitted word from the superposition signaling code.

FIG. 20 illustrates an example decoder as might be used for decoder 470 in the circuit of FIG. 3. The inputs 2210 are used as an index in a look-up-table (“LUT”) 2230 to perform decoding of the superposition signaling code. The LUT stores, for each input combination of d₀, . . . , d_(n−1), the corresponding set of original information bits b₀, . . . , b_(k−1), where k denotes the number of information bits per codeword. The outputs 2220 of the decoder may comprise these information bits.

The use of a LUT may not always be practical when the total number of codewords is large. To reduce complexity, the decoder might take advantage of the properties of the superposition signaling code and the underlying permutation modulation codes.

A preferred embodiment of a decoder that uses the properties of the underlying permutation modulation codes is shown in FIG. 21. The decoder of FIG. 21 may be used for a superposition signaling code comprising two layers. The inputs 2310 to the decoder are denoted by d₀, . . . , d_(n−1). The decoder comprises two split units 2320, 2322 whose inputs are d₀, . . . , d_(n−1). Split unit 2320 extracts from d₀, . . . , d_(n−1) information required to decode the first layer of the superposition signaling code, which is used by a PM decoder unit 2330. PM decoder unit 2330 decodes the permutation modulation code that defines the first layer of the superposition signaling code. Outputs 2340 of PM decoder unit 2330 comprise the k₁ decoded bits b₀, . . . , b_(k1−1). Split unit 2322 extracts from d₀, . . . , d_(n−1) information required to decode the second layer of the superposition signaling code, which is used by a PM decoder unit 2332. The PM decoder unit 2332 decodes the permutation modulation code that defines the second layer of the superposition signaling code. The outputs 2342 of PM decoder unit 2332 comprise the k₂ decoded bits b_(k1), . . . , b_(k1+k2−1).

FIG. 22 illustrates a decoder for an 8b5w superposition signaling code. The input of the decoder comprises the five numbers d₀, . . . , d₄. A split unit 2420 generates five numbers v₀, . . . , v₄, wherein v_(i)=d_(i) (mod 3) for i=0, 1, 2, 3, 4, and wherein the operation “(mod 3)” maps its argument, x, to the unique integer a in {−1, 0, 1} such that x−a is divisible by 3. In other words, the split unit returns v_(i)=0 if d_(i) is a multiple of three, v_(i)=−1 if d_(i) is one less than a multiple of three, and v_(i)=1 if d_(i) is one more than a multiple of three.

The numbers v₀, . . . , v₄ provide an estimate for the word of the permutation modulation code of the first layer of the 8b5w code. A PM decoder 2430 takes as its input the numbers v₀, . . . , v₄ and generates the bits b₀, . . . , b₃ as its output 2440. Split unit 2422 generates the five numbers w₀, . . . , w₄ as shown by Equation 15, wherein the round(x) operation denotes the closest integer to x.

$\begin{matrix} {w_{i} = 3 \cdot \text{round}\left( \frac{d_{i}}{3} \right),\quad i = 0,\ldots,4} & \left( {{Eqn}.\mspace{11mu} 15} \right) \end{matrix}$

The numbers w₀, . . . , w₄ provide an estimate for the word of the permutation modulation code of the second layer of the 8b5w code. A PM decoder 2432 takes as its input the numbers w₀, . . . , w₄ and generates the bits b₄, . . . , b₇ as its output 2442. The outputs 2440, 2442 are the outputs of the decoder. PM decoders 2430, 2432 may implement the inverse of the process that is exemplified in FIG. 8 to decode the permutation modulation code.

FIG. 22 is an example for the 8b5w code. Similar principles may be applicable to decoders of other superposition signaling codes.
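The two split operations of FIG. 22 can be sketched as follows, assuming the SDC outputs d₀, . . . , d₄ are integers. The function names `centered_mod3` and `split_8b5w` are hypothetical; the PM decoders 2430, 2432 themselves are not modeled.

```python
def centered_mod3(d):
    """Return the unique a in {-1, 0, 1} such that d - a is divisible by 3."""
    return (d + 1) % 3 - 1


def split_8b5w(d):
    """Split integer samples into first-layer (v) and second-layer (w) estimates.

    v follows the "(mod 3)" rule of split unit 2420; w follows Equation 15
    (split unit 2422). For integer inputs, d_i / 3 never lies exactly halfway
    between two integers, so round() is unambiguous.
    """
    v = [centered_mod3(x) for x in d]
    w = [3 * round(x / 3) for x in d]
    return v, w


v, w = split_8b5w([4, 0, -1, 0, -3])
print(v)  # [1, 0, -1, 0, 0]  -> permutation of x0 = [1 0 0 0 -1]
print(w)  # [3, 0, 0, 0, -3]  -> permutation of x1 = [3 0 0 0 -3]
```

In this sketch the two estimates add back to the received word, which reflects the two-layer structure of the code.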

Some Advantages of Superposition Signaling Codes for Chip-to-Chip Communications

The use of superposition signaling codes in chip-to-chip communications provides several advantages. First, superposition signaling codes provide resilience against common-mode noise and interference, and the introduction of SSO noise may be minimized by a driver architecture that is matched to the superposition signaling code. Second, the pin-efficiency of superposition signaling codes may be several times larger than that of binary differential signaling. Third, the transmission power that the drivers must expend for reliable communication is much less than that of competing signaling methods that achieve a pin-efficiency greater than r=0.5.

Noise Resilience

Several sources of noise in chip-to-chip communications have a component that is common on all wires of the communication bus. An example of such type of noise is interference and crosstalk that couples in from neighboring wires that are not part of the communication bus itself. Superposition signaling codes have resilience against common-mode noise and interference. Note that the inputs of the communication bus are generated according to Equation 1 and in case of common-mode noise and/or interference the received signals y₀(t), . . . , y_(n−1)(t) can be written as in Equation 16.

$\begin{matrix} {\begin{bmatrix} {y_{0}(t)} \\ \vdots \\ {y_{n - 1}(t)} \end{bmatrix} = {z \cdot p(t)} + \begin{bmatrix} {n_{c}(t)} \\ \vdots \\ {n_{c}(t)} \end{bmatrix}} & \left( {{Eqn}.\mspace{11mu} 16} \right) \end{matrix}$

In Equation 16, the common-mode noise is denoted by n_(c)(t). Effects such as frequency dependent attenuation might also be taken into account. However, these effects can be included and do not change the resilience against common-mode noise and interference. Since the entries of each codeword of a superposition signaling code sum to zero, a set of signals y′₀(t), . . . , y′_(n−1)(t) can be formed as in Equation 17.

$\begin{matrix} {\begin{bmatrix} {y_{0}^{\prime}(t)} \\ \vdots \\ {y_{n - 1}^{\prime}(t)} \end{bmatrix} = \begin{bmatrix} {y_{0}(t)} \\ \vdots \\ {y_{n - 1}(t)} \end{bmatrix} - \frac{1}{n}\sum\limits_{i = 0}^{n - 1}{y_{i}(t)} = {z \cdot p(t)}} & \left( {{Eqn}.\mspace{11mu} 17} \right) \end{matrix}$

For the set of signals y′₀(t), . . . , y′_(n−1)(t), the common-mode noise is cancelled.
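The cancellation of Equation 17 can be checked numerically with a short sketch. It assumes p(t)=1 at the sampling instant and a single common-mode offset added to every wire; the function name `cancel_common_mode` and the numeric values are hypothetical.

```python
def cancel_common_mode(samples):
    """Subtract the per-wire average from every sample, as in Equation 17."""
    mean = sum(samples) / len(samples)
    return [y - mean for y in samples]


# Hypothetical codeword z (entries sum to zero) and a common-mode offset n_c.
z = [4, 0, -1, 0, -3]
n_c = 0.75
received = [z_i + n_c for z_i in z]        # Equation 16 with p(t) = 1
recovered = cancel_common_mode(received)   # Equation 17
print(recovered)  # [4.0, 0.0, -1.0, 0.0, -3.0]
```

The average of the received samples equals n_c precisely because the codeword entries sum to zero, which is why the subtraction returns z.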

This type of common-mode noise cancellation may be performed explicitly. Such an embodiment is illustrated in FIG. 23. FIG. 23 exemplifies an SDC that incorporates a common-mode cancellation unit 2510. The inputs 2520 of the common-mode cancellation unit 2510 are summed and normalized by unit 2530. The result of the summation and normalization is sent to units 2540 that cancel the common-mode from each of the inputs 2520. The result is fed to an analog-to-digital converter (ADC) unit 2550 that performs the final stage of signal-to-digital conversion.

In a preferred embodiment, the ADC is designed in such a way that the specifics of the superposition signaling code are taken into account. The resolution of the ADC may, for instance, be matched to the superposition signaling code used. Furthermore, the ADC may be designed such that it performs a joint conversion across the wires. One may also employ a receiver architecture to detect the codeword of the superposition signaling code that is not sensitive to common-mode noise. Such receiver architecture implicitly cancels the common-mode noise and interference. Superposition signaling codes have an intrinsic resilience against common-mode noise and interference that may be exploited in hardware architectures.

A second advantage is that the intrinsic common-mode resilience of superposition signaling codes allows the transmitted word to be detected without a reference at the receiver. The reference may be generated by the received signals themselves. Note that unit 2530 in the circuit shown in FIG. 23 generates such a reference from the input signals 2520.

A major problem with many signaling schemes for chip-to-chip communication is the introduction of SSO noise. Superposition signaling codes allow for architectures that minimize the introduction of SSO noise. A transmit unit that accomplishes this for a superposition signaling code comprising two layers is shown in FIG. 24. The transmit unit of FIG. 24 transmits on n wires 2645. The transmit unit generates codewords from a superposition signaling code where the basis vectors x₀ and x₁ are of size n. The transmit unit comprises an encoder 2620 that encodes information into a permutation of a basis vector x₀ and an encoder 2622 that encodes information into a permutation of a basis vector x₁. The result of encoder 2620 is forwarded to a driver unit 2630 that generates a sequence of signals v₀(t), . . . , v_(n−1)(t). These signals may correspond to currents or voltages and are forwarded to a combine unit 2640.

The result of encoder 2622 is forwarded to a driver unit 2632 that generates a sequence of signals w₀(t), . . . , w_(n−1)(t). These signals may correspond to currents or voltages and are forwarded to the combine unit 2640. The combine unit 2640 generates a sequence of signals s₀(t), . . . , s_(n−1)(t) by combining the signals v_(i)(t) with w_(i)(t) according to Equation 18.

$\begin{matrix} {s_{i}(t) = v_{i}(t) + w_{i}(t),\quad i = 0,\ldots,n - 1} & \left( {{Eqn}.\mspace{11mu} 18} \right) \end{matrix}$

As explained above in this disclosure, the combine unit may make use of the physical addition of signals v_(i)(t) and w_(i)(t) to minimize hardware complexity.

The encoder 2620 and driver 2630 are both connected to Vdd and ground by parasitic inductors 2610 and 2612, respectively. The driver 2630 generates signals on the wires of the bus that correspond to permutations of x₀ and these signals may be proportional to the currents through parasitic inductors 2610 and 2612. Since the entries of x₀ sum to zero, the variation of currents through parasitic inductors 2610 and 2612 is minimized. The encoder 2622 and driver 2632 are both connected to Vdd and ground by parasitic inductors 2614 and 2616, respectively. The variation of currents through parasitic inductors 2614 and 2616 is minimized also. This greatly reduces SSO.
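The following sketch illustrates why the supply currents of one layer need not vary from codeword to codeword. It assumes, as a simplification, that the current drawn through the Vdd-side inductor is proportional to the total sourced current and that the ground-side current is proportional to the total sunk current. The helper names are hypothetical.

```python
from itertools import permutations


def total_sourced(word):
    """Sum of the positive entries: current assumed to be drawn from Vdd."""
    return sum(c for c in word if c > 0)


def total_sunk(word):
    """Sum of the magnitudes of the negative entries: current assumed returned to ground."""
    return sum(-c for c in word if c < 0)


x0 = [1, 0, 0, 0, -1]                 # first-layer basis vector of the 8b5w code
words = set(permutations(x0))         # all 20 distinct permutations of x0
sourced_totals = {total_sourced(w) for w in words}
sunk_totals = {total_sunk(w) for w in words}
print(sourced_totals, sunk_totals)    # {1} {1}
```

Under these assumptions every codeword of the layer draws the same total current, so switching between codewords does not change the current through the parasitic inductors of that layer.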

Embodiments of drivers shown in FIGS. 13-15 may be used as the drivers 2630, 2632 and the resulting transmit unit may have a good SSO resilience.

Pin-Efficiency

In chip-to-chip communication systems, it is beneficial to increase the pin-efficiency. This has several advantages. First, for a target data rate, the symbol rate and/or number of wires can be lowered. In practice this often translates to lower cost of package and PCB. Second, a chip-to-chip communication system with a higher pin-efficiency can transmit at a lower signaling rate to achieve a predetermined aggregate bandwidth compared to a system that has a lower pin-efficiency. Being able to lower the signaling rate relaxes equalization, which in turn leads to power savings in terms of circuitry required for equalization.

A conventional method to increase the pin-efficiency is to use multi-level modulation. Multi-level modulation is relatively straightforward to combine with differential signaling but the downside is that a substantial amount of transmit energy is required as is shown below. Furthermore, at the receiver the detection of the different levels requires more complex circuitry that increases power consumption also. Superposition signaling codes allow one to achieve a high pin-efficiency while keeping the transmit power consumption low. Furthermore, power may be saved in the circuitry for detection of superposition signaling codes compared to the detection of multi-level signaling with multiple differential links. Table 5 lists the pin-efficiency of several superposition signaling codes comprising two layers. The table also shows the number of different levels that are observed per wire. For comparison the number of levels used for multi-level differential signaling to achieve at least the same pin-efficiency r is shown as well.

TABLE 5

| n | x₀ | x₁ | log₂(S) | r | Levels/wire | Multilevel DS levels |
|---|---|---|---|---|---|---|
| 4 | [−1 −1 1 1] | [−2 −2 2 2] | 5.17 | 1.29 | 4 | 6 |
| 5 | [−1 1 0 0 0] | [−3 3 0 0 0] | 8.64 | 1.73 | 9 | 11 |
| 6 | [−1 −1 −1 1 1 1] | [−2 −2 −2 2 2 2] | 8.64 | 1.44 | 4 | 8 |
| 8 | [−1 −1 1 1 0 0 0 0] | [−3 −3 3 3 0 0 0 0] | 17.43 | 2.18 | 9 | 21 |
| 9 | [−1 1 0 0 0 0 0 0 0] | [−3 3 0 0 0 0 0 0 0] | 12.34 | 1.37 | 9 | 7 |

As is apparent from Table 5, pin-efficiencies that are substantially larger than 0.5 and even larger than 1 are possible. Furthermore, for most superposition signaling codes, the number of levels observed at each wire is less than would be the case to achieve the same pin-efficiency with differential signaling and multilevel signaling. The only superposition code shown in Table 5 that results in more levels per wire also turns out to be very power-efficient.
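The log₂(S) and r columns can be reproduced with a short sketch. It assumes that every combination of one permutation per layer yields a distinct codeword, so that the number of codewords is the product of the per-layer permutation counts. The function names are hypothetical.

```python
from collections import Counter
from math import factorial, log2


def num_permutations(basis):
    """Number of distinct permutations of a basis vector (multinomial coefficient)."""
    count = factorial(len(basis))
    for multiplicity in Counter(basis).values():
        count //= factorial(multiplicity)
    return count


def pin_efficiency(bases):
    """Return (bits per codeword, bits per wire) for a list of layer basis vectors.

    Assumes all layer combinations give distinct codewords.
    """
    n = len(bases[0])
    bits = sum(log2(num_permutations(b)) for b in bases)
    return bits, bits / n


bits, r = pin_efficiency([[-1, 1, 0, 0, 0], [-3, 3, 0, 0, 0]])
print(round(bits, 2), round(r, 2))  # 8.64 1.73
```

For the n=5 code, each layer has 20 permutations, giving 400 codewords and therefore 8.64 bits on five wires.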

To increase the pin-efficiency even further, one may use more than two layers. Table 6 illustrates some examples of superposition signaling codes where three layers are used.

TABLE 6

| n | x₀ | x₁ | x₂ | log₂(S) | r | Levels/wire | Multilevel DS levels |
|---|---|---|---|---|---|---|---|
| 5 | [−1 1 0 0 0] | [−3 3 0 0 0] | [−9 9 0 0 0] | 12.97 | 2.59 | 27 | 37 |
| 8 | [−1 −1 1 1 0 0 0 0] | [−3 −3 3 3 0 0 0 0] | [−9 −9 9 9 0 0 0 0] | 26.14 | 3.27 | 27 | 93 |
| 9 | [−1 1 0 0 0 0 0 0 0] | [−3 3 0 0 0 0 0 0 0] | [−9 9 0 0 0 0 0 0 0] | 18.51 | 2.06 | 27 | 18 |

As is apparent from Table 6, a transmit unit can obtain pin-efficiencies that are substantially larger than 1 when three layers are used. Furthermore, for most superposition signaling codes, the number of levels observed at each wire is less than would be the case to achieve the same pin-efficiency with differential signaling and multilevel signaling. The only superposition code that results in more levels per wire turns out to be very power-efficient. Superposition signaling codes allow achieving pin-efficiencies that are as high as multi-level differential signaling. However, as will be shown next, superposition signaling codes are substantially more power-efficient in terms of the transmission power required for a certain noise margin. Furthermore, efficient receiver architectures may be implemented that use the properties of the superposition signaling code to achieve additional power savings.

Power Consumption and Noise Margin

Superposition signaling codes allow for a large reduction in power consumption of the drivers compared to conventional signaling methods such as differential signaling. To illustrate this, a general embodiment for a driver is considered that may be used to implement differential signaling, multi-level differential signaling, schemes based on superposition signaling codes and some conventional schemes. To compare the different signaling schemes, it is assumed that they operate at the same noise margin. There are several ways to define the noise margin and for the purpose of this disclosure it is assumed that the minimum separation between the levels observed at each of the wires is normalized to one.

Driver Embodiments for Driver Power Comparison

A preferred embodiment of a general driver that drives n bus wires is exemplified in FIG. 25. The driver exemplified in FIG. 25 may be used to implement differential signaling, multi-level differential signaling, schemes based on superposition signaling codes or other signaling methods. The communication bus 440 comprises n wires 445 and a reference wire 2714. The reference wire 2714 might not be an explicit part of the bus itself. The reference wire 2714 may be, for instance, ground and voltages may be defined with respect to this reference wire 2714. The communication bus may be terminated at the transmitter by resistors 2720 and at the receiver by resistors 2722. The energy source of the driver is the voltage source 2730. The voltage source 2730 supplies a voltage equal to Vdd. The signals that are transmitted on the wires of the bus are defined by the current sources 2740 and the current sources 2742. Each of the current sources 2740 sources a current of strength I_(i) into the i-th wire and each of the current sources 2742 sinks a current of strength J_(i) from the i-th wire. The numbers I₀, . . . , I_(n−1) and J₀, . . . , J_(n−1) define the current sources 2740 and 2742. Furthermore, they are non-negative and satisfy Equation 19.

$\begin{matrix} {{\sum\limits_{i = 0}^{n - 1}I_{i}} = {\sum\limits_{i = 0}^{n - 1}J_{i}}} & \left( {{Eqn}.\mspace{11mu} 19} \right) \end{matrix}$

When input of the communication bus is generated according to Equation 1, the numbers I₀, . . . , I_(n−1) and J₀, . . . , J_(n−1) that define the current sources 2740 and 2742 are given by Equation 20.

$\begin{matrix} {{I_{i} = \frac{{z_{i}} + z_{i}}{2}}{J_{i} = \frac{{z_{i}} - z_{i}}{2}}} & \left( {{Eqn}.\mspace{11mu} 20} \right) \end{matrix}$

In Equation 20, z_(i) denotes the i-th position of z. The energy for driving the bus wires is supplied by the voltage source 2730 and when the bus inputs are generated according to Equation 1, the instantaneous power consumed, denoted by P(t), may be written as shown by Equation 21.

$\begin{matrix} {P(t) = {Vdd} \cdot p(t) \cdot \sum\limits_{i = 0}^{n - 1}I_{i} = {Vdd} \cdot p(t) \cdot \frac{\| z \|_{1}}{2}} & \left( {{Eqn}.\mspace{11mu} 21} \right) \end{matrix}$

In Equation 21, ∥z∥₁ denotes the L1-norm of z. In a chip-to-chip communication system, Vdd and p(t) may be fixed and the power consumption determined by ∥z∥₁. In comparing different chip-to-chip communication schemes, one may assume that Vdd is equal to one and p(t) is a constant function that has value one. At the receiver side, the voltages V₀, . . . , V_(n−1) are sensed and these are proportional to the vector z. The constant of proportionality is dependent on the values of the termination resistors 2722 and the loss on the wires 2712. For comparing the power consumption of different signaling methods, one may assume that all resistors 2722 have the same value that is equal to 1. In this case, the observed voltage V_(i) is equal to V_(i)=R·z_(i)=z_(i).
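The relation between the sourced currents and the L1-norm in Equation 21 can be checked with a minimal sketch, using the normalization Vdd=1 and p(t)=1 described above. The function name `driver_power` and the example word are hypothetical.

```python
def driver_power(z, vdd=1.0, pulse=1.0):
    """Instantaneous driver power per Equation 21: Vdd * p(t) * sum of sourced currents."""
    sourced = [(abs(z_i) + z_i) / 2 for z_i in z]  # I_i from Equation 20
    return vdd * pulse * sum(sourced)


z = [4, 0, -1, 0, -3]                 # hypothetical codeword, entries sum to zero
l1_norm = sum(abs(z_i) for z_i in z)
print(driver_power(z))                # 4.0
print(l1_norm / 2)                    # 4.0
```

The two values agree because the entries of z sum to zero, so the sourced currents account for exactly half of the L1-norm.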

Practical wires have losses. Furthermore, termination resistors 2722 with a value close to 50 ohms are more likely. However, for comparing different signaling schemes, one can neglect these losses since they are the same for the signaling schemes under comparison. For the same reason, the exact value of the termination resistors 2722 is irrelevant for the comparison.

In a preferred embodiment, one may want to perform combining in analog to obtain resilience against SSO. A preferred embodiment for such a driver is exemplified in FIG. 26 for the case of a bus communication system that employs a superposition signaling code comprising two layers. The communication bus 440 comprises n wires 445 and a reference wire 2814. The reference wire 2814 may not be an explicit part of the bus itself. The reference wire 2814 may be, for instance, ground and voltages may be defined with respect to this reference wire 2814. The communication bus 440 may be terminated at the transmitter by resistors 2820. The energy source of the driver is the voltage source 2830. The voltage source 2830 supplies a voltage equal to Vdd. The signals that are transmitted on the wires of the bus are defined by the current sources 2840, 2842, 2844 and 2846. The current sources 2840 and 2844 source a current of strength I_(1i) and I_(2i), respectively, into the i-th wire. The current sources 2842 and 2846 sink a current of size J_(1i) and J_(2i), respectively, from the i-th wire. The numbers I_(11), . . . , I_(1n), I_(21), . . . , I_(2n), J_(11), . . . , J_(1n) and J_(21), . . . , J_(2n) are non-negative and satisfy Equation 22.

$\begin{matrix} {\sum\limits_{i = 1}^{n}I_{1i} = \sum\limits_{i = 1}^{n}J_{1i},\quad \sum\limits_{i = 1}^{n}I_{2i} = \sum\limits_{i = 1}^{n}J_{2i}} & \left( {{Eqn}.\mspace{11mu} 22} \right) \end{matrix}$

When the input of the communication bus is generated according to Equation 2, the numbers I_(11), . . . , I_(1n), I_(21), . . . , I_(2n), J_(11), . . . , J_(1n) and J_(21), . . . , J_(2n) that define the current sources 2840, 2842, 2844 and 2846 are given by Equation 23.

$\begin{matrix} {{I_{1i} = \frac{{x_{0\; i}} + x_{0i}}{2}}{J_{1i} = \frac{{x_{0\; i}} - x_{0i}}{2}}{I_{2i} = \frac{{x_{1i}} + x_{1i}}{2}}{J_{2i} = \frac{{x_{1\; i}} - x_{1i}}{2}}} & \left( {{Eq}\;{n.\mspace{11mu} 23}} \right) \end{matrix}$

In Equation 23, x_(0i) denotes the i-th position of x₀ and x_(1i) denotes the i-th position of x₁.

Power Consumption of Multi-Level Differential Signaling

In differential signaling and multi-level differential signaling, the signals transmitted on the bus have the form shown in Equation 24.

$\begin{matrix} {\begin{bmatrix} {s_{0}(t)} \\ {s_{1}(t)} \end{bmatrix} = {\begin{bmatrix} {- 1} \\ 1 \end{bmatrix} \cdot x \cdot {p(t)}}} & \left( {{Eqn}.\mspace{11mu} 24} \right) \end{matrix}$

In Equation 24, x is an element from a pulse-amplitude modulation (PAM) signal constellation S. Other multi-level signal constellations S may be used for multi-level differential signaling. A family of PAM constellations that is often used is defined by Equation 25.

$\begin{matrix} {S = \frac{1}{2} \cdot \left\{ - M + 1 + 2i\;:\; i = 0,\ldots,M - 1 \right\}} & \left( {{Eqn}.\mspace{11mu} 25} \right) \end{matrix}$

In Equation 25, M denotes the number of levels and M is generally a power of two. The minimum separation between the different levels of the constellation symbols is normalized to one. To generate the multi-level signals for transmission on the bus according to the general driver shown in FIG. 26 one may make use of the fact that any x∈S can be written as shown in Equation 26.

$\begin{matrix} {x = \frac{1}{2}\sum\limits_{i = 0}^{{\log_{2}M} - 1}{b_{i}2^{i}}} & \left( {{Eqn}.\mspace{11mu} 26} \right) \end{matrix}$

In Equation 26, b_(i) is a number equal to either 1 or −1 and may represent a bit. This also shows that differential signaling can be seen as a special case of a superposition signaling code comprising l=log₂(M) layers and where the i-th basis vector is given by Equation 27.

$\begin{matrix} {x_{i} = {2^{i - 1}\begin{bmatrix} {- 1} \\ 1 \end{bmatrix}}} & \left( {{Eqn}.\mspace{11mu} 27} \right) \end{matrix}$
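The layered view of a PAM constellation can be verified numerically. The sketch below assumes M is a power of two and takes the sum over one term per layer, i = 0, . . . , log₂(M)−1, with each b_(i) equal to 1 or −1. The function names are hypothetical.

```python
from itertools import product


def pam_levels(M):
    """PAM constellation of Equation 25 with unit spacing between levels."""
    return sorted((-M + 1 + 2 * i) / 2 for i in range(M))


def layered_levels(M):
    """Levels obtained by superposing log2(M) binary layers with weights 2**(i-1)."""
    layers = M.bit_length() - 1  # log2(M), assuming M is a power of two
    return sorted(
        sum(b * 2 ** i for i, b in enumerate(bits)) / 2
        for bits in product((-1, 1), repeat=layers)
    )


print(pam_levels(4))      # [-1.5, -0.5, 0.5, 1.5]
print(layered_levels(4))  # [-1.5, -0.5, 0.5, 1.5]
```

Both constructions give the same M levels, which is the sense in which multi-level differential signaling resembles a layered code on two wires.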

However, differential signaling is not considered as a superposition signaling code since only two permutations of each of the basis vectors are used. From this, one can deduce that the driver architecture exemplified in FIG. 26 may implement multi-level differential signaling. For multi-level differential signaling, the power consumed in the driver when the separation between the different constellation symbols is 1 is given by Equation 28.

$\begin{matrix} {P_{S} = \frac{M^{2} - 1}{6}} & \left( {{Eqn}.\mspace{11mu} 28} \right) \end{matrix}$

Power Consumption of Superposition Signaling Codes

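In a preferred embodiment where a superposition signaling code with two layers is used, the vector z is generated as shown by Equation 29 (which is the same as Equation 6).

$\begin{matrix} {z = P_{0}\left( x_{0} \right) + P_{1}\left( x_{1} \right)} & \left( {{Eqn}.\mspace{11mu} 29} \right) \end{matrix}$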

The instantaneous power consumed, denoted by P(t), can be written as shown in Equation 30.

$\begin{matrix} {P(t) = V_{dd} \cdot p(t) \cdot \left( \sum\limits_{i = 0}^{n - 1}I_{1i} + \sum\limits_{i = 0}^{n - 1}I_{2i} \right) = V_{dd} \cdot p(t) \cdot \frac{\| x_{0} \|_{1} + \| x_{1} \|_{1}}{2}} & \left( {{Eqn}.\mspace{11mu} 30} \right) \end{matrix}$

Per Equation 30, the power consumption is proportional to the sum of the L1-norms of x₀ and x₁. When a superposition signaling code with l layers is used, the expression for the instantaneous power consumption becomes as shown in Equation 31.

$\begin{matrix} {P(t) = V_{dd} \cdot p(t) \cdot \sum\limits_{j = 1}^{l}\sum\limits_{i = 0}^{n - 1}I_{ji} = V_{dd} \cdot p(t) \cdot \frac{1}{2} \cdot \sum\limits_{i = 0}^{l - 1}\| x_{i} \|_{1}} & \left( {{Eqn}.\mspace{11mu} 31} \right) \end{matrix}$

The part that is dependent on the signal constellation S that the superposition signaling code defines is denoted by P_(S) and is given by Equation 32.

$\begin{matrix} {P_{S} = \frac{1}{2} \cdot \sum\limits_{i = 0}^{l - 1}\| x_{i} \|_{1}} & \left( {{Eqn}.\mspace{11mu} 32} \right) \end{matrix}$

Once the noise margin is equal for different signaling methods, a key parameter is the amount of energy expended per bit, E_(b), as in E_(b)=(P_(S)/r).

Comparison of Driver Power Consumption

For multi-level differential signaling the power consumed per bit of the general driver shown in FIG. 26 is given in Table 7 for several values of M.

TABLE 7

| M | log₂(S) | r | E_(b) |
|---|---|---|---|
| 2 | 1 | 0.5 | 1.0 |
| 4 | 2 | 1.0 | 2.5 |
| 8 | 3 | 1.5 | 4.0 |
| 16 | 4 | 2.0 | 10.0 |
| 32 | 5 | 2.5 | 68.0 |

The power consumption of the driver for several superposition signaling codes comprising two layers is shown in Table 8.

TABLE 8

n   x₀                     x₁                     log₂(S)   r      E_b
4   [−1 −1 1 1]            [−2 −2 2 2]            5.17      1.29   1.16
5   [−1 1 0 0 0]           [−3 3 0 0 0]           8.64      1.73   0.46
6   [−1 −1 −1 1 1 1]       [−2 −2 −2 2 2 2]       11.81     1.97   0.76
8   [−1 −1 1 1 0 0 0 0]    [−3 −3 3 3 0 0 0 0]    17.43     2.18   0.46
9   [−1 1 0 0 0 0 0 0 0]   [−3 3 0 0 0 0 0 0 0]   12.34     1.37   0.32
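The n = 4 and n = 5 rows can be reproduced with the short sketch below. It rests on three assumptions that are not stated explicitly in the text: the number of codewords is the product of the numbers of distinct permutations of x₀ and x₁, the rate r is log₂(S) divided by n, and the energy per bit is taken as P_S divided by log₂(S), a normalization chosen here because it matches the tabulated E_b values for these two rows. The function names are hypothetical.

```python
import math
from collections import Counter


def distinct_permutations(x):
    """Number of distinct permutations of x: n! / (m_1! * m_2! * ...)."""
    count = math.factorial(len(x))
    for multiplicity in Counter(x).values():
        count //= math.factorial(multiplicity)
    return count


def table_row(x0, x1):
    """Return (log2(S), r, E_b) rounded to two decimals for a two-layer code."""
    n = len(x0)
    # Assumption: every pair of layer codewords gives a distinct sum.
    bits = math.log2(distinct_permutations(x0) * distinct_permutations(x1))
    p_s = 0.5 * (sum(map(abs, x0)) + sum(map(abs, x1)))  # Equation 32
    # Assumption: energy per bit is P_S divided by the bits per codeword.
    return round(bits, 2), round(bits / n, 2), round(p_s / bits, 2)


print(table_row([-1, -1, 1, 1], [-2, -2, 2, 2]))        # (5.17, 1.29, 1.16)
print(table_row([-1, 1, 0, 0, 0], [-3, 3, 0, 0, 0]))    # (8.64, 1.73, 0.46)
```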

The power used per transmitted bit of superposition signaling codes is up to a factor of 10 less than that of multi-level differential signaling. A circuit can therefore achieve substantial power savings by using superposition signaling codes instead of multi-level differential signaling. Hence superposition signaling codes provide good resilience against various types of noise while keeping the required driver power consumption low. Furthermore, several embodiments have been disclosed that result in low hardware complexity.

Additional Applications of Superposition Signaling Codes

As one of skill in the art will recognize, superposition signaling codes provide several advantages when used to transmit information in communication systems. Superposition signaling codes are useful when information is transmitted in quantities of k bits on multiple physical channels of communication. The use of superposition signaling codes provides resilience against several types of noise and allows for very low power communications.

Each of the physical channels is referred to as a wire in this application, and the set of wires constitutes the communication bus. However, one should keep in mind that these wires may include the whole path from the transmitting IC to the receiving IC. This implies that, for instance, bond wires, strip lines, pins, etc. may be included in the physical channel constituting the wire. Furthermore, these ICs may be located in the same device, may be stacked on top of each other in a package-on-package configuration, or may even be integrated on the same die. In the latter case the ICs are really two components of the same chip.

Each of the wires constituting the communication bus may be a medium carrying electrical signals. However, a wire may also comprise an optical fiber that carries optical signals, or a combination of the two. Furthermore, a wire may in part carry electrical signals and in another part carry optical signals. Another possibility is that communication between two ICs takes place wirelessly. What is important is that, in the end, two ICs communicate with one another over a plurality of wires, where a wire is understood to be a very general term for a path between the transmitting and receiving ICs.

The preferred embodiments mostly illustrate the use of superposition signaling codes for chip-to-chip communications. However, this should not be seen in any way as limiting the scope of the present invention. The methods disclosed in this application may also be used in other communication settings such as wireless communications or optical communications. Furthermore, superposition signaling codes may also be used in storage of information.

What is claimed is:
 1. A method for transmitting information over a communication bus, the method comprising: receiving a first set of signals representing the information as a plurality of bits; mapping, using an encoder, a first set of the plurality of bits to a first vector in a first set of vectors, the first set of vectors part of a first permutation modulation code, wherein the first set of the plurality of bits are mapped to the first vector according to a first set of indices, the first set of indices generated at least in part by an index generator operating on the first set of the plurality of bits; mapping using the encoder, a second set of the plurality of bits not in the first set of the plurality of bits to a second vector in a second set of vectors, the second set of vectors part of a second permutation modulation code, wherein the second set of the plurality of bits are mapped to the second vector according to a second set of indices, the second set of indices generated at least in part by the index generator operating on the second set of the plurality of bits; and providing a second set of signals for transmission over the communication bus, wherein the second set of signals is based on a vector sum of the first and second vectors, wherein the vector sum forms a codeword of a superposition signaling code that is based on the first and second sets of vectors, and wherein each signal of the second set is transmitted over a respective wire of the bus.
 2. The method of claim 1, wherein a pin-efficiency is larger than 1.
 3. The method of claim 1, wherein a pin-efficiency is larger than 1.5.
 4. The method of claim 1, wherein the index generator generates the first and second sets of indices corresponding to a permutation of a basis vector.
 5. The method of claim 1, wherein the mapping includes combining a plurality of secondary codewords to form the codeword of the superposition signaling code, wherein the plurality of secondary codewords are combined as voltages.
 6. The method of claim 1, wherein the vector sum is formed by combining currents associated with the first and second vectors.
 7. The method of claim 1, wherein the vector sum is formed by wiring connections associated with the first and second vectors.
 8. The method of claim 1, wherein a number of components of the first vector is equal to a number of components of the second vector.
 9. The method of claim 1, wherein a number of components of the first vector is not equal to a number of components of the second vector.
 10. The method of claim 1, further comprising: mapping, using the encoder, a third set of the plurality of bits not in the first or second set to a third vector in a third set of vectors, the third set of vectors selected from a third permutation modulation code, wherein the second set of signals is based on a vector sum of the first, second, and third vectors.
 11. A method for encoding a predetermined number, k, of bits of information into a codeword of a superposition signaling code that is defined by a first basis vector of predetermined size, n₁, and a second basis vector of predetermined size, n₂, the method comprising: providing a first encoder for a permutation modulation code defined by the first basis vector, the first encoder including a first index generator; providing a second encoder for a permutation modulation code defined by the second basis vector, the second encoder including a second index generator; receiving a first set of signals representing the k bits of information; dividing the signals into a first part and a second part wherein the first part represents a predetermined number of k₁ bits and the second part represents a predetermined number of k₂ bits; generating a first permutation modulation codeword by applying the first part to the first index generator of the first encoding circuit; generating a second permutation modulation codeword by applying the second part to the second index generator of the second encoding circuit; and combining the first permutation modulation codeword of the first encoding circuit and the second permutation modulation codeword of the second encoding circuit, wherein the combining is based on a superposition of the first and second permutation modulation codewords.
 12. The method of claim 11, wherein the superposition signaling code has a pin-efficiency of at least 1.
 13. The method of claim 11, wherein the superposition signaling code has a pin-efficiency of at least 1.6.
 14. The method of claim 11, wherein the combining is performed by superimposing currents in wires of a multi-wire communication bus.
 15. The method of claim 11, wherein a number of components of the first vector is equal to a number of components of the second vector.
 16. The method of claim 11, wherein a number of components of the first vector is not equal to a number of components of the second vector.
 17. A communication system for transmitting a superpositioning codeword belonging to a superposition signaling code, the communication system comprising: a communication bus with an integer, n, independent signal paths for signal transmission; means for generating a first codeword belonging to a first permutation-modulation codeword set; means for generating a second codeword belonging to a second permutation-modulation codeword set; means for generating the superpositioning codeword representing a sum of the first and second codewords; and means for transmitting the superposition codeword over the communication bus.
 18. The communication system of claim 17, wherein the communication system transmits with a pin-efficiency of 1.
 19. The communication system of claim 17, wherein the communication system transmits with a pin-efficiency of 1.6.